PyPDF2 error: 'latin-1' codec can't encode characters in position 8-9: ordinal not in range(256)
While using the PyPDF2 library, I ran into the following error: 'latin-1' codec can't encode characters in position 8-9: ordinal not in range(256)
Solution:
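The message itself comes from Python's latin-1 codec, which can only represent code points 0 to 255, so any Chinese character in a string triggers it. A minimal sketch that reproduces the same message outside PyPDF2 (the helper name try_latin1 and the sample string are made up for illustration, not taken from the library):

```python
def try_latin1(text):
    """Return the latin-1 bytes, or the error message if encoding fails."""
    try:
        return text.encode("latin-1")
    except UnicodeEncodeError as exc:
        return str(exc)


# Hypothetical sample: the two Chinese characters sit at indexes 8 and 9.
message = try_latin1("/Title (中文)")
print(message)
```

Plain ASCII or Western European text passes through unchanged; only characters above U+00FF fail.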
1. Modify generic.py in the PyPDF2 package
Original code at line 488 of generic.py:
    try:
        return NameObject(name.decode('utf-8'))
    except (UnicodeEncodeError, UnicodeDecodeError) as e:
        # Name objects should represent irregular characters
        # with a '#' followed by the symbol's hex number
        if not pdf.strict:
            warnings.warn("Illegal character in Name Object", utils.PdfReadWarning)
            return NameObject(name)
        else:
            raise utils.PdfReadError("Illegal character in Name Object")
Change it to:
    try:
        return NameObject(name.decode('utf-8'))
    except (UnicodeEncodeError, UnicodeDecodeError) as e:
        try:
            # Fall back to GBK for names that are not valid UTF-8
            return NameObject(name.decode('gbk'))
        except (UnicodeEncodeError, UnicodeDecodeError) as e:
            # Name objects should represent irregular characters
            # with a '#' followed by the symbol's hex number
            if not pdf.strict:
                warnings.warn("Illegal character in Name Object", utils.PdfReadWarning)
                return NameObject(name)
            else:
                raise utils.PdfReadError("Illegal character in Name Object")
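The idea behind this change is "try UTF-8 first, then GBK". It can be checked in isolation with a small sketch; decode_name is a hypothetical helper written for this note, not a PyPDF2 function:

```python
def decode_name(raw, encodings=("utf-8", "gbk")):
    """Decode raw bytes with the first encoding that works, else return None."""
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue  # try the next candidate encoding
    return None


gbk_name = "/中文".encode("gbk")     # bytes that are not valid UTF-8
utf8_name = "/中文".encode("utf-8")

print(decode_name(gbk_name))
print(decode_name(utf8_name))
```

Note that GBK accepts many byte sequences, so for a PDF written in some other legacy encoding this fallback may decode without error but produce the wrong characters.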
2. Modify utils.py in the PyPDF2 package
Original code at line 238 of utils.py:
    r = s.encode('latin-1')
    if len(s) < 2:
        bc[s] = r
    return r
Change it to:
    try:
        r = s.encode('latin-1')
        if len(s) < 2:
            bc[s] = r
        return r
    except Exception as e:
        # latin-1 cannot represent this string; show it and fall back to UTF-8
        print(s)
        r = s.encode('utf-8')
        if len(s) < 2:
            bc[s] = r
        return r
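The same fallback can be tried outside the library. This is only a sketch of the pattern, assuming a hypothetical helper named encode_with_fallback; it leaves out the bc cache and catches only UnicodeEncodeError rather than every exception:

```python
def encode_with_fallback(text):
    """Encode as latin-1 when possible, otherwise as UTF-8."""
    try:
        return text.encode("latin-1")
    except UnicodeEncodeError:
        # Characters above U+00FF (for example Chinese) end up here.
        return text.encode("utf-8")


print(encode_with_fallback("abc"))
print(encode_with_fallback("中文"))
```

One consequence of this approach is that the resulting bytes can be in either encoding depending on the input, so whatever reads them back has to cope with both.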
With these two changes, the error is resolved.