Modifying txt files with Python

Source: baiyundou.net Date: 2024-09-20

Author: siseniao

Improved from this article: https://post.smzdm.com/p/an370emp/?zdm_ss=Android_1106136211_&send_by=1106136211&from=other&invite_code=zdmwffzv7winv

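The original script recomputes the MD5 of every file on each scan, which costs a lot of disk I/O for large files. This version adds a cache file that stores the MD5 values, so each scan only hashes files it has not seen before, which makes it faster.

Without further ado, here is the code: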

import os
import hashlib

# Only duplicates with the extensions listed below are deleted.
# Add other extensions (with the leading dot) if you need them.
file_type = ['.jpg', '.jpeg', '.png', '.gif', '.psd', '.bmp', '.webp', '.mp4', '.mkv', '.avi', '.mov', '.mpeg', '.mpg',
             '.rar', '.zip']
check_files = []

# Edit this list of directories to scan
work_dir_list = [r'/volume2/111', r'/volume1/222']

def save_md5_file(files_dict: dict):
    """Write the cache to md5.txt, one "path_md5=file_md5" entry per line."""
    if files_dict is None:
        return
    try:
        with open("md5.txt", "w") as f:
            for path_md5, file_md5 in files_dict.items():
                f.write(str(path_md5) + "=" + str(file_md5) + '\n')
    except OSError as e:
        print('Could not save md5.txt:', e)

def open_md5_file():
    """Load the cache from md5.txt; return an empty dict if it cannot be read."""
    files_md5 = {}
    try:
        with open("md5.txt", "r") as f:
            for md5_line in f:
                list_keys = md5_line.split('=')
                if len(list_keys) == 2:
                    files_md5[list_keys[0].strip()] = list_keys[1].strip()
    except OSError:
        pass  # no cache file yet, e.g. on the first run

    return files_md5

def remove_repeat_files():
    # Collect every regular file whose extension is listed in file_type
    for work_dir in work_dir_list:
        for root, dirs, files in os.walk(work_dir):
            for name in files:
                p_type = os.path.splitext(name)[1]
                if p_type in file_type:
                    check_files.append(os.path.join(root, name))
    files_dict = {}  # content MD5 -> path of the copy kept so far
    files_md5 = open_md5_file()  # MD5 of the path -> cached content MD5
    r_index = 0
    print('Files Num:%s' % len(check_files))

    for file_path in check_files:
        try:
            md5_path = hashlib.md5()
            md5_path.update(file_path.encode('utf-8'))
            path_md5 = md5_path.hexdigest()
            file_md5 = files_md5.get(path_md5)
            if file_md5 is None:
                # Not cached yet: hash the contents in 4 KB blocks
                md5_hash = hashlib.md5()
                with open(file_path, "rb") as f:
                    for byte_block in iter(lambda: f.read(4096), b""):
                        md5_hash.update(byte_block)
                file_md5 = md5_hash.hexdigest()
                print('Check file MD5:%s' % file_path)
                files_md5[path_md5] = file_md5

            if files_dict.get(file_md5) is None:
                files_dict[file_md5] = file_path
            else:
                # Same content as a file seen earlier: keep the copy with
                # the older st_ctime and delete the other one
                d_path = files_dict[file_md5]
                d_path_stats = os.stat(d_path)
                file_stats = os.stat(file_path)
                d_time = d_path_stats.st_ctime
                f_time = file_stats.st_ctime
                if d_time > f_time:
                    os.remove(d_path)
                    files_dict[file_md5] = file_path
                    print('Delete File:', d_path)
                    r_index += 1
                else:
                    os.remove(file_path)
                    print('Delete File:', file_path)
                    r_index += 1
        except Exception as e:
            # Report the failure and move on to the next file
            print('Skipped %s: %s' % (file_path, e))

    print('File Count:%s, Repeat Files Num:%s. All deleted!' % (len(check_files), r_index))
    save_md5_file(files_md5)

if __name__ == '__main__':
    remove_repeat_files()

The script can be run over SSH or as a scheduled task.
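
One caveat with a cache keyed only by path: if a file is edited or replaced in place, its cached MD5 goes stale. Below is a minimal sketch of one way around that, assuming a file can be treated as unchanged when its size and modification time are unchanged. The names `md5_of_file`, `file_signature`, and `cached_md5`, and the in-memory `cache` dict, are hypothetical and not part of the script above.

```python
import hashlib
import os


def md5_of_file(path, block_size=4096):
    """Hash a file's contents in fixed-size blocks."""
    md5_hash = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            md5_hash.update(block)
    return md5_hash.hexdigest()


def file_signature(path):
    """Cheap fingerprint: size and modification time in nanoseconds."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)


def cached_md5(path, cache):
    """Return the MD5 of path, rehashing only when its signature changed.

    cache maps path -> (signature, md5) and is updated in place.
    (Assumed layout for this sketch, not the md5.txt format.)
    """
    sig = file_signature(path)
    entry = cache.get(path)
    if entry is not None and entry[0] == sig:
        return entry[1]  # cache hit: the contents are not read again
    digest = md5_of_file(path)
    cache[path] = (sig, digest)
    return digest
```

Persisting such a cache would need the signature stored next to each MD5, for example as extra fields on each line of the cache file.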

Original article: "Deleting duplicate files on a Synology NAS with Python (cached MD5 file approach)", published on 什么值得买, 2023-03-20.

Q (杨霞彩3479): How do I read a text file with Python and operate on each line that is read?
A (谈可要15363523643): To read a text file with Python and operate on each line, write it like this:

    with open("test.txt", "r") as f:
        for line in f:
            # do something with each line here
            line = line.strip()
            p = line.rfind('.')
            filename = line[0:p]
            print("create %s" % line)

...

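Q (杨霞彩3479): How do I modify a text file with Python?
A (谈可要15363523643): Use fileinput. With inplace=True, whatever is printed inside the loop is written back to the file:

    import fileinput

    for line in fileinput.input("filepath", inplace=True):
        line = line.replace("oldtext", "newtext")
        print(line, end="")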

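Q (杨霞彩3479): How do I process the data in a txt document with Python?
A (谈可要15363523643): Store all the data in a dict. When the line "先婚厚爱莫萦 0" is read, add an entry to the dict with key "先婚厚爱莫萦" and value '0'. When "先婚厚爱莫萦 1" is read, the dict already has the key "先婚厚爱莫萦" with value 0, so change the value to 1. Once all the data has been read, iterate over the dict and print it.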
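
The dict approach described in this answer can be sketched as follows, assuming each line holds a key and a value separated by whitespace. The function name `latest_values` and the second sample line are made up for illustration.

```python
def latest_values(lines):
    """Keep the last value seen for each key."""
    data = {}
    for line in lines:
        parts = line.rsplit(None, 1)  # split on the last run of whitespace
        if len(parts) != 2:
            continue                  # skip blank or malformed lines
        key, value = parts
        data[key] = value             # a later line overwrites an earlier one
    return data


sample = ["先婚厚爱莫萦 0", "another title 0", "先婚厚爱莫萦 1"]
for key, value in latest_values(sample).items():
    print(key, value)
```

Splitting from the right keeps keys that contain spaces intact, which matters when the key is a title rather than a single word.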

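Q (杨霞彩3479): Replacing txt content with regular expressions in Python
A (谈可要15363523643): Walk the string and track whether you are inside double quotes. As posted, the replacement character is the same comma, so the output equals the input; put whatever should stand in for a quoted comma on the marked line:

    s = 'kfhakl,"dasf,fwg,gs,fatg,ta,",fasf,aga,wr,ga,czxv,"fsafa,rqr,cacv,",dasc'
    l = []
    quoted = False
    for ch in s:
        if ch == '"':
            quoted = not quoted   # toggle on every double quote
        elif ch == ',' and quoted:
            l.append(',')         # comma inside quotes: replacement goes here
            continue
        l.append(ch)
    s = ''.join(l)
    print(s)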
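
Since the question asks about regular expressions, the same idea can also be written with `re.sub` and a replacement function. This is a sketch, assuming quoted fields never contain an escaped double quote and that `;` is the wanted replacement. Both are assumptions, and the function name `replace_commas_in_quotes` is hypothetical.

```python
import re


def replace_commas_in_quotes(text, replacement=";"):
    """Replace commas that sit between a pair of double quotes."""
    # "[^"]*" matches one quoted field; only the matched field is rewritten.
    return re.sub(r'"[^"]*"',
                  lambda m: m.group(0).replace(",", replacement),
                  text)


s = 'kfhakl,"dasf,fwg,gs,fatg,ta,",fasf,aga,wr,ga,czxv,"fsafa,rqr,cacv,",dasc'
print(replace_commas_in_quotes(s))
# prints: kfhakl,"dasf;fwg;gs;fatg;ta;",fasf,aga,wr,ga,czxv,"fsafa;rqr;cacv;",dasc
```

Commas outside the quotes are left alone because the pattern only ever matches whole quoted fields.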

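Q (杨霞彩3479): Help wanted: write a Python script that replaces the first line of a file with the file's name. For example, if the file is named 123.txt, change its first line to 123.txt.
A (谈可要15363523643):

    filename = "123.txt"
    with open(filename, "r") as f:
        lines = f.readlines()
    lines[:1] = [filename + "\n"]   # replace line 1 (also works for an empty file)
    with open(filename, "w") as f:
        f.writelines(lines)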

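Q (杨霞彩3479): Handling txt files of unknown encoding in Python
A (谈可要15363523643): You can detect the encoding first and then decode accordingly:

    import chardet

    def CheckCode(raw):
        # raw is the file content as bytes, e.g. open(path, "rb").read()
        adchar = chardet.detect(raw)
        encoding = (adchar['encoding'] or '').lower()
        if encoding == 'utf-8':
            return raw.decode('utf-8')
        return raw.decode('gbk')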
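
If installing `chardet` is not an option, a plain fallback chain is often enough for files that are either UTF-8 or GBK. This is a minimal sketch under that assumption. `decode_text` and the candidate list are illustrative choices, not taken from the answer.

```python
def decode_text(raw, encodings=("utf-8", "gbk")):
    """Try each encoding in order and return (text, encoding_used)."""
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    # Last resort: decode with the first encoding, replacing bad bytes
    return raw.decode(encodings[0], errors="replace"), encodings[0]


text, used = decode_text("中文".encode("gbk"))
print(used, text)
```

UTF-8 goes first because it is strict: GBK-encoded Chinese text is rarely valid UTF-8, so a successful UTF-8 decode is a fairly reliable signal.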

Q (杨霞彩3479): How do I batch-rename files with Python?
A (谈可要15363523643):

    import os

    oldname = ['a.txt', 'b.txt']
    newname = ['aa.txt', 'bb.txt']
    for old, new in zip(oldname, newname):
        os.rename(old, new)

Q (杨霞彩3479): How do I write Python code that creates a txt file?
A (谈可要15363523643):

    with open('text.txt', 'w') as text:
        text.write('hello')

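Q (杨霞彩3479): How do I use Python to take file names from the text and then batch-rename files with regular expressions?
A (谈可要15363523643): Step zero: the problem. I bought 星火英语's collection of 100 CET-6 morning-reading essays online (I passed CET-6 with a high score long ago, but I like these essays and bought them to enjoy again), only to find that the files are all named 01.txt, 10.txt, and so on. To make them easier to search, the file names need to be changed. Step one: extract the file name from the file....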
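
The answer is cut off here, so what follows is only a guess at the general shape of such a script, not the answerer's code. It assumes each numbered file carries its title on the first line and is UTF-8 encoded, and it uses a regular expression to strip characters that are unsafe in file names. `rename_by_first_line` and the `NN_<title>.txt` naming pattern are hypothetical.

```python
import os
import re


def rename_by_first_line(folder):
    """Rename NN.txt to 'NN_<title>.txt', taking the title from line 1."""
    renamed = []
    for name in sorted(os.listdir(folder)):
        match = re.fullmatch(r"(\d+)\.txt", name)
        if not match:
            continue  # leave every other file alone
        path = os.path.join(folder, name)
        with open(path, "r", encoding="utf-8") as f:
            title = f.readline().strip()
        # Collapse whitespace and characters not allowed in file names
        title = re.sub(r'[\\/:*?"<>|\s]+', "_", title).strip("_")
        if not title:
            continue  # nothing usable to build a name from
        new_name = "%s_%s.txt" % (match.group(1), title)
        os.rename(path, os.path.join(folder, new_name))
        renamed.append((name, new_name))
    return renamed
```

Because renamed files no longer match the `NN.txt` pattern, running the function a second time on the same folder leaves them untouched.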

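Q (杨霞彩3479): How do I set Notepad++ to use Python syntax highlighting by default when opening txt files?
A (谈可要15363523643): Settings -> Style Configurator -> select python in the "Language" tree -> add "txt" to the user-defined extensions -> save and close. From then on, Notepad++ applies Python syntax highlighting whenever it opens a txt file. I am using Notepad++ 6.8, for reference.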

(Editor: 自媒体)