A Tool for Listing Classes and Functions

piaodoo · Programming Tutorials · 2020-02-22 22:14:56 · Python tutorials

Source: the 吾爱破解 (52pojie) forum

Last edited by ymhld on 2020-1-30 14:18

The tool has been fixed and an exe has been built.

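I. Why I wrote this small program
I have recently been learning PDF handling in Python. Code I found online often failed when I tried to use it, and the failure was in the import statement itself.
Importing a module should be trivial: you just use it. In practice, though, modules get upgraded, and the functions and classes named in older code no longer match what the installed version provides.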

II. Examples
from pdfminer.pdfparser import PDFParser,PDFDocument

ImportError: cannot import name 'PDFDocument' from 'pdfminer.pdfparser' (C:\Users\Administrator\AppData\Local\Programs\Python\Python38\lib\site-packages\pdfminer\pdfparser.py)

from pdfminer.pdfinterp import PDFTextExtractionNotAllowed

ImportError: cannot import name 'PDFTextExtractionNotAllowed' from 'pdfminer.pdfinterp' (C:\Users\Administrator\AppData\Local\Programs\Python\Python38\lib\site-packages\pdfminer\pdfinterp.py)

These are the failing examples, and errors like them kept coming up. The code I found online was all basically the same, and I could not see what was wrong with it, which was very frustrating. So I set out to find the actual cause. Following the trail, I located the .py files behind the import, in C:\Users\Administrator\AppData\Local\Programs\Python\Python38\lib\site-packages\pdfminer
It turns out that everything installed with pip install lives under the folder below, so to settle the matter once and for all I decided to extract every function and class in it:
C:\Users\Administrator\AppData\Local\Programs\Python\Python38\lib\site-packages
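Before scanning files by hand, a failing import like the ones above can be checked programmatically. The following is a minimal sketch, not part of the original tool; the helper name has_name is made up for illustration, and standard-library modules are used so that it runs without pdfminer installed.

```python
import importlib


def has_name(module_name, attr_name):
    """Return True if module_name imports and exposes attr_name (hypothetical helper)."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself is missing or not installed.
        return False
    return hasattr(module, attr_name)


# Standard-library modules stand in for pdfminer here.
print(has_name("os.path", "join"))         # True
print(has_name("os.path", "PDFDocument"))  # False
```

With pdfminer installed, has_name("pdfminer.pdfparser", "PDFDocument") could be called the same way; the answer depends on the installed version.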

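III. A few points worth explaining:
1. The Python installation directory is obtained from repr(os.path). os.path looks like <module 'ntpath' from 'C:\\Users\\Administrator\\AppData\\Local\\Programs\\Python\\Python38\\lib\\ntpath.py'>
so it has to be converted to a string and the path extracted with a regular expression. Thanks to @lijt16 for help with the regular expression while I was writing this program.
os.path.join(class_path2,'site-packages\\\\') then gives the correct directory.
2. The files under C:\Users\Administrator\AppData\Local\Programs\Python\Python38\lib\site-packages\ are collected by walking the directory tree. Files whose names start with an underscore (such as __init__.py) and standalone .py files sitting directly in site-packages are not scanned, because I found nothing useful in them; they can be handled later if needed.
3. Some files fail to read as UTF-8 during extraction; those are simply skipped.
The rest is just formatting, so I will not go into it.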
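As an alternative to parsing repr(os.path), the standard library can report the site-packages location directly. This is a minimal sketch under that assumption, not the method the tool uses; the function name site_packages_dirs is made up for illustration.

```python
import site
import sysconfig


def site_packages_dirs():
    """Return candidate site-packages directories for the running interpreter."""
    # "purelib" is the install location for pure-Python packages.
    dirs = [sysconfig.get_paths()["purelib"]]
    # site.getsitepackages() can be missing in some older virtualenv setups,
    # so it is looked up defensively.
    getter = getattr(site, "getsitepackages", None)
    if getter is not None:
        for d in getter():
            if d not in dirs:
                dirs.append(d)
    return dirs


for d in site_packages_dirs():
    print(d)
```

This avoids depending on the exact text of a module's repr and on Windows-style path separators.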

IV. Results
The results are written to output.txt and look like this:

Package: adodbapi

File:       adodbapi.py

      Function: Dispatch(dispatch):
      Function: getIndexedValue(obj,index):
      Function: Dispatch(dispatch):
      Function: getIndexedValue(obj,index):
      Function: make_COM_connecter():
      Function: connect(*args, **kwargs): # --> a db-api connection object
      Function: ultIsolationLevel = adc.adXactReadCommitted
      Function: ultCursorLocation = adc.adUseClient # changed from adUseServer as of v 2.3.0
      Function: format_parameters(ADOparameters, show_value=False):
      Function: counter():
Class: ----Connection(object):
      Function: dbapi(self): # a proposed db-api version 3 extension.
      Function: connect(self, kwargs, connection_maker=make_COM_connecter):
      Function: close(self):
      Function: commit(self):
      Function: cursor(self):
      Function: printADOerrors(self):
      Function: get_table_names(self):
Class: ----Cursor(object):
      Function: prepare(self, operation):
      Function: build_column_info(self, recordset):
      Function: get_description(self):
      Function: format_description(self,d):
      Function: close(self, dont_tell_me=False):
..........

So can the original errors now be tracked down?

from pdfminer.pdfdocument import PDFDocument
The pattern is: from <package>.<file> import <name>
Searching the output for PDFDocument shows that it lives in the pdfdocument file. Fixed!

from pdfminer.pdfdocument import PDFTextExtractionNotAllowed

File:       pdfdocument.py
Class: ----PDFNoValidXRef(PDFSyntaxError):
Class: ----PDFNoOutlines(PDFException):
Class: ----PDFDestinationNotFound(PDFException):
Class: ----PDFEncryptionError(PDFException):
Class: ----PDFPasswordIncorrect(PDFEncryptionError):
Class: ----PDFTextExtractionNotAllowed(PDFEncryptionError):
Class: ----PDFBaseXRef:

The class PDFTextExtractionNotAllowed is in pdfdocument. Fixed!

PDFTextExtractionNotAllowed, on the other hand, does not appear anywhere in pdfinterp.py:
File:       pdfinterp.py
Class: ----PDFResourceError(PDFException):
Class: ----PDFInterpreterError(PDFException):
Class: ----PDFTextState:
      Function: copy(self):
      Function: reset(self):
Class: ----PDFGraphicState:
      Function: copy(self):
Class: ----PDFResourceManager:
      Function: get_procset(self, procs):
      Function: get_cmap(self, cmapname,strict=False):
      Function: get_font(self, objid, spec):
Class: ----PDFContentParser(PSStackParser):
      Function: fillfp(self):
      Function: seek(self, pos):
      Function: fillbuf(self):
      Function: get_inline_data(self, pos,target=b'EI'):
      Function: flush(self):
      Function: do_keyword(self, pos, token):
Class: ----PDFPageInterpreter:
      Function: dup(self):
      Function: init_resources(self, resources):
      Function: get_colorspace(spec):
      Function: init_state(self, ctm):
      Function: push(self, obj):
      Function: pop(self, n):
      Function: get_current_state(self):
      Function: set_current_state(self, state):
      Function: do_q(self):
      Function: do_Q(self):
      Function: do_cm(self, a1, b1, c1, d1, e1, f1):
      Function: do_w(self, linewidth):
      Function: do_J(self, linecap):
      Function: do_j(self, linejoin):
      Function: do_M(self, miterlimit):
      Function: do_d(self, dash, phase):
      Function: do_ri(self, intent):
      Function: do_i(self, flatness):
      Function: do_gs(self, name):
      Function: do_m(self, x, y):
      Function: do_l(self, x, y):
      Function: do_c(self, x1, y1, x2, y2, x3, y3):
      Function: do_v(self, x2, y2, x3, y3):
      Function: do_y(self, x1, y1, x3, y3):
      Function: do_h(self):
      Function: do_re(self, x, y, w, h):
      Function: do_S(self):
      Function: do_s(self):
      Function: do_f(self):
      Function: do_F(self):
      Function: do_f_a(self):
      Function: do_B(self):
      Function: do_B_a(self):
      Function: do_b(self):
      Function: do_b_a(self):
      Function: do_n(self):
      Function: do_W(self):
      Function: do_W_a(self):
      Function: do_CS(self, name):
      Function: do_cs(self, name):
      Function: do_G(self, gray):
      Function: do_g(self, gray):
      Function: do_RG(self, r, g, b):
      Function: do_rg(self, r, g, b):
      Function: do_K(self, c, m, y, k):
      Function: do_k(self, c, m, y, k):
      Function: do_SCN(self):
      Function: do_scn(self):
      Function: do_SC(self):
      Function: do_sc(self):
      Function: do_sh(self, name):
      Function: do_BT(self):
      Function: do_ET(self):
      Function: do_BX(self):
      Function: do_EX(self):
      Function: do_MP(self, tag):
      Function: do_DP(self, tag, props):
      Function: do_BMC(self, tag):
      Function: do_BDC(self, tag, props):
      Function: do_EMC(self):
      Function: do_Tc(self, space):
      Function: do_Tw(self, space):
      Function: do_Tz(self, scale):
      Function: do_TL(self, leading):
      Function: do_Tf(self, fontid, fontsize):
      Function: do_Tr(self, render):
      Function: do_Ts(self, rise):
      Function: do_Td(self, tx, ty):
      Function: do_TD(self, tx, ty):
      Function: do_Tm(self, a, b, c, d, e, f):
      Function: do_T_a(self):
      Function: do_TJ(self, seq):
      Function: do_Tj(self, s):
      Function: do__q(self, s):
      Function: do__w(self, aw, ac, s):
      Function: do_BI(self):
      Function: do_ID(self):
      Function: do_EI(self, obj):
      Function: do_Do(self, xobjid):
      Function: process_page(self, page):
      Function: render_contents(self, resources,streams, ctm=MATRIX_IDENTITY):
      Function: execute(self, streams):
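The manual lookup above (searching the listing for the file that defines a given class) can also be automated. The following is a minimal sketch, not part of the original tool; find_definition is a made-up helper name, and the standard-library json package stands in for an installed package such as pdfminer.

```python
import os
import re


def find_definition(package_dir, name):
    """Return the .py files under package_dir that define a class or function called name."""
    # \b keeps "Foo" from matching "FooBar".
    pattern = re.compile(r"^\s*(?:class|def)\s+" + re.escape(name) + r"\b")
    hits = []
    for root, _dirs, files in os.walk(package_dir):
        for fname in files:
            if not fname.endswith(".py"):
                continue
            path = os.path.join(root, fname)
            # errors="replace" keeps files that are not valid UTF-8 from aborting the scan.
            with open(path, "r", encoding="utf-8", errors="replace") as fh:
                if any(pattern.match(line) for line in fh):
                    hits.append(path)
    return sorted(hits)


if __name__ == "__main__":
    import json
    # The json package directory is only a stand-in for a package in site-packages.
    for path in find_definition(os.path.dirname(json.__file__), "JSONDecoder"):
        print(path)
```

Pointing package_dir at the pdfminer folder and passing "PDFTextExtractionNotAllowed" would report the file to import from, assuming the class exists in the installed version.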

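Finally, I am sharing the finished tool with everyone on the forum. Happy Spring Festival!
The path lookup has been fixed, so the exe and zip now work.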

https://www.lanzous.com/i8x5hqj

https://www.lanzous.com/i8x5hra

The output.txt generated on my own machine:
https://www.lanzous.com/i8wovgb

The code is a mess; pointers from more experienced members are welcome.

# -*- coding: utf-8 -*-

# Raw strings keep the backslashes in the banner from being read as escape sequences.
print(r"                                                              ")
print(r"      _____    _____      _____  _____                   ____ ")
print(r" ___|\     \  |\    \    /    /||\    \   _____         |    |")
print(r"|    |\     \ | \    \  /    / || |    | /    /|        |    |")
print(r"|    | |     ||  \____\/    /  /\/     / |    ||        |    |")
print(r"|    | /_ _ /  \ |    /    /  / /     /_  \   \/  ____  |    |")
print(r"|    |\    \    \|___/    /  / |     // \  \   \ |    | |    |")
print(r"|    | |    |       /    /  /  |    |/   \ |    ||    | |    |")
print(r"|____|/____/|      /____/  /   |\ ___/\   \|   /||\____\|____|")
print(r"|    /     ||     |`    | /    | |   | \______/ || |    |    |")
print(r"|____|_____|/     |_____|/      \|___|/\ |    | | \|____|____|")
print(r" \(    )/           )/            \(   \|____|/     \(   )/  ")
print(r"   '    '            '              '      )/         '   '   ")
print(r"                                           '                 ")

import os
import re



#print (sys.argv[0])  # name of the script being run
#print (os.path.split( os.path.realpath( sys.argv[0] ) )) # directory and file name of the script, as a tuple
#print ("##################################################")
#ScriptPath = os.path.split( os.path.realpath( sys.argv[0] ) )[0]
#print (ScriptPath) # the resulting directory

def walkFile(file,file1):
    # file: package directory to scan; file1: the open output file
    # Common mistake: passing the bare names returned by os.listdir() straight
    # to os.path.isdir() / os.path.isfile().
    # Correct usage: join each name to its directory with os.path.join() first.
    for root, dirs, files in os.walk(file):
        # root:  the directory currently being visited
        # dirs:  list of subdirectory names in that directory
        # files: list of file names in that directory
        for f in files:
            filenm=os.path.join(root, f)
            (filename,fileext)=os.path.splitext(f)
            # only .py files whose names do not start with an underscore
            if fileext==".py" and not filename.startswith("_"):
                #print(filenm)
                print ("\nFile:\t",os.path.basename(filenm))
                file1.write("\nFile:\t"+os.path.basename(filenm))
                walkPY(filenm,file1)


def walkPY(file,file1): # read one .py file and write its classes and functions to file1
    try:
        with open(file,'r',encoding='UTF-8') as lines:
            for line in lines:
                line=line.strip()
                # Match the keyword plus a space, so that lines such as
                # "defaultIsolationLevel = ..." are not mistaken for definitions.
                if line.startswith("class "):
                    classname=line[6:]
                    print ("\nClass: ----",classname)
                    file1.write("\nClass: ----"+classname)
                if line.startswith("def "):
                    # skip names that start with an underscore
                    if not line[4:].startswith("_"):
                        print("\n\tFunction: "+line[4:])
                        file1.write("\n\tFunction: "+line[4:])
            print ()
    except UnicodeDecodeError:
        # files that cannot be decoded as UTF-8 are skipped
        pass
        
        



def walkpath(class_path,file):  # class_path: site-packages directory; file: the open output file
    path = os.listdir(class_path)
    print (path)  # debug: names found in site-packages
    for p in path:
        # only subdirectories (packages) are scanned
        if os.path.isdir(os.path.join(class_path,p)):
            print ("\nPackage: ",p)
            file.write("\n\n\nPackage: "+p)
            print (class_path,file)  # debug
            walkFile(os.path.join(class_path,p),file)






def main():
    #print (repr(os.path))  # shows where the os.path module was loaded from
    #class_path=r"C:\\Users\\Administrator\\AppData\\Local\\Programs\\Python\\Python38\\Lib\\site-packages\\"
    # Take everything between "from '" and the last backslash in repr(os.path),
    # i.e. the lib directory. This relies on Windows-style paths.
    class_path2=re.compile(r"from\s'(.+\\)").findall(repr(os.path))[0]
    class_path=os.path.join(class_path2,'site-packages\\\\')
    #print ('lib directory:',class_path2)
    #print (class_path)
    # The with block closes the file, so no explicit close() call is needed.
    with open(file='output.txt',mode='w',encoding='UTF-8') as file:
        walkpath(class_path,file)


if __name__ == "__main__":
    main()
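The listing above recognises definitions by looking at the first characters of each line. As a possible alternative, the standard-library ast module can parse the source and report real class and function definitions only. This is a minimal sketch, not the author's method; list_definitions is a made-up name, and like the original it skips functions whose names start with an underscore.

```python
import ast


def list_definitions(source):
    """Return (kind, name) pairs for top-level functions, classes, and public methods."""
    func_types = (ast.FunctionDef, ast.AsyncFunctionDef)
    results = []
    for node in ast.parse(source).body:
        if isinstance(node, func_types):
            if not node.name.startswith("_"):
                results.append(("function", node.name))
        elif isinstance(node, ast.ClassDef):
            results.append(("class", node.name))
            for item in node.body:
                if isinstance(item, func_types) and not item.name.startswith("_"):
                    results.append(("method", node.name + "." + item.name))
    return results


sample = '''
defaultIsolationLevel = 1

def connect(*args, **kwargs):
    pass

class Connection(object):
    def __init__(self):
        pass
    def close(self):
        pass
'''

for kind, name in list_definitions(sample):
    print(kind, name)
# function connect
# class Connection
# method Connection.close
```

One trade-off: ast.parse raises SyntaxError on files the running interpreter cannot parse (for example Python 2-only sources), so a real scan would need to catch that as well as decoding errors.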

Article link: https://www.piaodoo.com/7943.html