Merge pull request #10 from dblate/patch-9

Update 5.在无标记数据集上进行预训练.md
long_long_ago 2025-04-17 09:40:22 +08:00 committed by GitHub
commit bfef672f28
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
1 changed files with 2 additions and 2 deletions


@@ -422,7 +422,7 @@ tensor(10.7940)
```python
file_path = "the-verdict.txt"
with open(file_path, "r", encoding="utf-8") as file:
    text_data = file.read()
```
After loading the dataset, we can check the number of characters and tokens it contains:
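The diff context ends before the counting code itself, so what follows is only a minimal sketch of such a check, not the file's actual contents. The helper name `count_chars_and_tokens` and the whitespace-splitting stand-in tokenizer are assumptions made for illustration; in practice you would pass the `encode` method of whichever tokenizer the chapter uses.

```python
from typing import Callable, List, Tuple


def count_chars_and_tokens(text: str, encode: Callable[[str], List]) -> Tuple[int, int]:
    """Return (character count, token count) for `text`.

    `encode` is any callable that maps a string to a list of tokens.
    """
    return len(text), len(encode(text))


# Stand-in tokenizer (an assumption for this sketch): split on whitespace.
# A real subword tokenizer would usually yield a different token count.
sample_text = "a tiny sample text"
total_characters, total_tokens = count_chars_and_tokens(sample_text, str.split)

print("Characters:", total_characters)
print("Tokens:", total_tokens)
```

To run the same check on the loaded dataset, call the helper with `text_data` in place of `sample_text` and the real tokenizer's encode function in place of `str.split`.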