PyPDF2 error: 'latin-1' codec can't encode characters in position 8-9: ordinal not in range(256)

While using the PyPDF2 library, I ran into the following error: 'latin-1' codec can't encode characters in position 8-9: ordinal not in range(256)
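
The error itself comes from Python's codec machinery rather than from anything PDF-specific: Latin-1 only covers code points 0 to 255, so any string containing Chinese characters fails to encode. A minimal sketch that reproduces the same message without PyPDF2 (the sample string is a made-up illustration, not taken from a real PDF):

```python
# Hypothetical sample string; the two Chinese characters sit at indexes 8 and 9.
text = "/Title (中文)"

try:
    text.encode("latin-1")
    message = None
except UnicodeEncodeError as exc:
    # exc.start / exc.end mark the slice of the string that could not be encoded.
    message = str(exc)
    bad_slice = text[exc.start:exc.end]

print(message)
print(bad_slice)
```

The printed message matches the error above, and `bad_slice` shows which characters triggered it.
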
Solution:
1. Modify the generic.py file in the PyPDF2 package
Original code at line 488 of generic.py:

try:
    return NameObject(name.decode('utf-8'))
except (UnicodeEncodeError, UnicodeDecodeError) as e:
    # Name objects should represent irregular characters
    # with a '#' followed by the symbol's hex number
    if not pdf.strict:
        warnings.warn("Illegal character in Name Object", utils.PdfReadWarning)
        return NameObject(name)
    else:
        raise utils.PdfReadError("Illegal character in Name Object")

Change it to:

try:
    return NameObject(name.decode('utf-8'))
except (UnicodeEncodeError, UnicodeDecodeError) as e:
    try:
        return NameObject(name.decode('gbk'))
    except (UnicodeEncodeError, UnicodeDecodeError) as e:
        # Name objects should represent irregular characters
        # with a '#' followed by the symbol's hex number
        if not pdf.strict:
            warnings.warn("Illegal character in Name Object", utils.PdfReadWarning)
            return NameObject(name)
        else:
            raise utils.PdfReadError("Illegal character in Name Object")
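
The idea behind this change is a decode fallback chain: try UTF-8 first, then GBK, and only then fall through to the original strict / non-strict handling. A standalone sketch of that pattern, outside PyPDF2 (the function name `decode_name`, its parameters, and the use of `ValueError` are assumptions for illustration, not part of the library):

```python
import warnings


def decode_name(raw: bytes, strict: bool = False, encodings=("utf-8", "gbk")):
    """Hypothetical helper: decode `raw` with the first encoding that works."""
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue  # this encoding does not fit, try the next one
    # No encoding worked: mirror the strict / non-strict split of the patch above.
    if strict:
        raise ValueError("Illegal character in Name Object")
    warnings.warn("Illegal character in Name Object")
    return raw  # hand back the undecoded bytes, as the patched code does


utf8_name = decode_name("/中文".encode("utf-8"))
gbk_name = decode_name("/中文".encode("gbk"))
print(utf8_name, gbk_name)
```

The order matters: GBK accepts many byte sequences that are not meaningful text, so it is tried only after UTF-8 has been ruled out.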

2. Modify the utils.py file in the PyPDF2 package

Original code at line 238 of utils.py:

r = s.encode('latin-1')
if len(s) < 2:
    bc[s] = r
return r

Change it to:

try:
    r = s.encode('latin-1')
    if len(s) < 2:
        bc[s] = r
    return r
except Exception as e:
    # Latin-1 cannot represent this string: print it for inspection,
    # then fall back to UTF-8
    print(s)
    r = s.encode('utf-8')
    if len(s) < 2:
        bc[s] = r
    return r

This resolves the error.