Python strip method


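Source: baiyundou.net Date: 2024-08-22

Author: siseniao

Improved from this article: https://post.smzdm.com/p/an370emp/?zdm_ss=Android_1106136211_&send_by=1106136211&from=other&invite_code=zdmwffzv7winv

The original script recalculated the MD5 of every file on each scan, which costs a lot of disk I/O for large files. This version adds a cache file that stores the MD5 values, so each scan only hashes new files, which is more efficient.

Without further ado, here is the code: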

import os
import hashlib

# Only duplicates with the following extensions are deleted.
# To handle other file types, add their extensions to this list.
file_type = ['.jpg', '.jpeg', '.png', '.gif', '.psd', '.bmp', '.webp', '.mp4', '.mkv', '.avi', '.mov', '.mpeg', '.mpg',
             '.rar', '.zip']
check_files = []

# Edit this list of directories to scan
work_dir_list = [r'/volume2/111', r'/volume1/222']

def save_md5_file(files_dict: dict):
    # Write the cache to md5.txt, one "path_md5=file_md5" entry per line
    if files_dict is None:
        return
    try:
        with open("md5.txt", "w") as f:
            for path_md5, file_md5 in files_dict.items():
                f.write(str(path_md5) + "=" + str(file_md5) + '\n')
    except Exception as e:
        print('Failed to save md5.txt:', e)

def open_md5_file():
    # Load the cache from md5.txt; return an empty dict if it cannot be read
    files_md5 = {}
    try:
        with open("md5.txt", "r") as f:
            for md5_line in f:
                list_keys = md5_line.split('=')
                if len(list_keys) == 2:
                    files_md5[list_keys[0].strip()] = list_keys[1].strip()
    except OSError:
        # A missing cache file is expected on the first run
        pass

    return files_md5

def remove_repeat_files():
    # Collect every file under the work directories whose extension is in file_type
    for work_dir in work_dir_list:
        for root, dirs, files in os.walk(work_dir):
            for name in files:
                p_type = os.path.splitext(name)[1]
                if p_type in file_type:
                    check_files.append(os.path.join(root, name))
    files_dict = {}  # file MD5 -> path of the copy being kept
    files_md5 = open_md5_file()  # MD5 of the path -> cached file MD5
    r_index = 0
    print('Files Num:%s' % len(check_files))

    for file_path in check_files:
        try:
            # The cache key is the MD5 of the path itself
            path_md5 = hashlib.md5(file_path.encode('utf-8')).hexdigest()
            file_md5 = files_md5.get(path_md5)
            if file_md5 is None:
                # Not cached yet: hash the file contents in 4 KB blocks
                md5_hash = hashlib.md5()
                with open(file_path, "rb") as f:
                    for byte_block in iter(lambda: f.read(4096), b""):
                        md5_hash.update(byte_block)
                file_md5 = md5_hash.hexdigest()
                print('Check file MD5:%s' % file_path)
                files_md5[path_md5] = file_md5

            if files_dict.get(file_md5) is None:
                files_dict[file_md5] = file_path
            else:
                # Same content seen before: keep the copy with the older st_ctime
                d_path = files_dict[file_md5]
                d_time = os.stat(d_path).st_ctime
                f_time = os.stat(file_path).st_ctime
                if d_time > f_time:
                    os.remove(d_path)
                    files_dict[file_md5] = file_path
                    print('Delete File:', d_path)
                else:
                    os.remove(file_path)
                    print('Delete File:', file_path)
                r_index += 1
        except Exception as e:
            # Report the problem instead of silently skipping the file
            print('Skipped %s: %s' % (file_path, e))

    print('File Count:%s, Repeat Files Num:%s. All deleted!' % (len(check_files), r_index))
    save_md5_file(files_md5)


if __name__ == '__main__':
    remove_repeat_files()
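
One limitation worth noting: the cache in the script is keyed by the path alone, so a file that is replaced or edited in place keeps its old cached MD5. Below is a minimal sketch of one way around this, assuming it is acceptable to build the key from the path plus the file size and modification time. The helper names `cache_key`, `md5_of_file`, and `cached_md5` are hypothetical and not part of the original script.

```python
import hashlib
import os


def cache_key(path: str) -> str:
    """Hypothetical helper: key derived from path, size, and mtime."""
    st = os.stat(path)
    signature = f"{path}|{st.st_size}|{st.st_mtime_ns}"
    # Hash the signature so the key never contains '=' or newlines,
    # which keeps it compatible with the "key=value" lines in md5.txt.
    return hashlib.md5(signature.encode("utf-8")).hexdigest()


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files are never loaded whole."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def cached_md5(path: str, cache: dict) -> str:
    """Return the MD5 from the cache, hashing the file only on a miss."""
    key = cache_key(path)
    if key not in cache:
        cache[key] = md5_of_file(path)
    return cache[key]
```

With this kind of key, a changed file simply misses the cache and is hashed again. The trade-off is that old entries for changed or deleted files stay in the cache until they are pruned.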

The script can be run over SSH or as a scheduled task.

Original title: Deleting duplicate files on a Synology NAS with Python (cached file MD5 approach). Source: 什么值得买, published 2023-03-20.

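满治质2308 A question about Python strip, please help. Thanks -
虞废贾13248829915 ______ The s in the middle is kept because strip only works on the two ends of the string. s.strip('sa y') removes any of the characters ' ', 'a', 's', and 'y' from the start and the end, and stops as soon as it meets a character that is not one of them. At the front it stops at 'e' and at the back it stops at 'o', so the result is es no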
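
The string from the original question is not shown in the answer. Assuming it was something like 'say yes no' (a guess that happens to be consistent with the stated result), the behaviour can be reproduced like this:

```python
# Assumed input: the asker's actual string is not given in the answer
s = "say yes no"

# The argument is treated as a set of characters, not as a substring,
# so the order of the characters in it does not matter.
result = s.strip("sa y")
print(result)  # es no

same = s.strip("ys a")
print(same == result)  # True
```

The 's' inside "es" survives because stripping already stopped at 'e' before reaching it.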

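满治质2308 How does Python read a string -
虞废贾13248829915 ______ Python is a very widely used scripting language; Google, for example, is known to use it. It has proven powerful in many fields, including bioinformatics, statistics, web development, and numerical computing. Like programs written in other languages such as Java, R, and Perl, Python scripts can be run directly from the command line...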

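满治质2308 Python query and sorting question: there is one line of code I don't understand. What does str.strip mean? -
虞废贾13248829915 ______ str is the string and rm is the sequence of characters to remove. str.strip(rm) removes characters found in rm from both the start and the end of the string. str.lstrip(rm) removes them from the start only. str.rstrip(rm) removes them from the end only. In your code, the string method strip is simply being passed to map as an argument.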
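
A small sketch of the three methods, and of passing `str.strip` to `map` before sorting. The sample strings are made up for illustration, since the asker's code is not shown:

```python
text = "xxhello worldxx"
print(text.strip("x"))   # hello world
print(text.lstrip("x"))  # hello worldxx
print(text.rstrip("x"))  # xxhello world

# With no argument, strip removes leading and trailing whitespace.
# map(str.strip, raw) calls str.strip on every element of raw.
raw = ["  b ", "a\n", " c"]
cleaned = sorted(map(str.strip, raw))
print(cleaned)  # ['a', 'b', 'c']
```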

满治质2308 Using strip in Python has no effect: s=open('z.txt') t=read().strip() -
虞废贾13248829915 ______
f = open('z.txt', 'r')
for line in f.readlines():
    print(line.strip())
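
Note that the snippet in the question calls `read()` on its own rather than as a method of the file object `s`, which raises a NameError before strip is ever reached. Below is a minimal sketch of the two common patterns, using a temporary file with made-up content in place of z.txt:

```python
import os
import tempfile

# Throwaway file standing in for z.txt (hypothetical content)
path = os.path.join(tempfile.mkdtemp(), "z.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("  first line  \nsecond line\n\n")

# Strip the whole text: only the outer whitespace is removed
with open(path, encoding="utf-8") as f:
    whole = f.read().strip()

# Strip line by line: every line loses its surrounding whitespace
with open(path, encoding="utf-8") as f:
    lines = [line.strip() for line in f]

print(repr(whole))  # 'first line  \nsecond line'
print(lines)        # ['first line', 'second line', '']
```

Calling strip on the whole text leaves whitespace inside the text untouched, which may be why it looked as if strip did nothing.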

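满治质2308 Operating on external programs with Python -
虞废贾13248829915 ______ The code is as follows:
object_id_list = [1, 3, 88, 99]
f = open('mylist', 'w')
for id in object_id_list:
    f.writelines(str(id))
f.close()  # the data is only actually written to the file once this line runs
cat mylist
138899%  # the trailing % indicates there is no final newline
>>> object_id_list=[1, 3, ...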
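
The output 138899 runs together because `writelines` does not add line separators. Below is a sketch of writing one ID per line, with a temporary directory used here so that the example does not touch a real `mylist` file:

```python
import os
import tempfile

object_id_list = [1, 3, 88, 99]
path = os.path.join(tempfile.mkdtemp(), "mylist")

# writelines writes each string as-is, so the newline must be supplied.
# The with block closes (and therefore flushes) the file on exit.
with open(path, "w") as f:
    f.writelines(f"{i}\n" for i in object_id_list)

with open(path) as f:
    content = f.read()
print(repr(content))  # '1\n3\n88\n99\n'
```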

满治质2308 How can Python read a whole file with one line of code (a .txt file, reading every line)? -
虞废贾13248829915 ______ Just use the readlines function. The complete code is: text = open(file,'r').readlines()
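
One detail to be aware of: `readlines` keeps the trailing newline on each line. If that is not wanted, `read().splitlines()` is a common alternative. A short sketch, again using a temporary file with made-up content:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("alpha\nbeta\n")

with open(path, encoding="utf-8") as f:
    with_newlines = f.readlines()

with open(path, encoding="utf-8") as f:
    without_newlines = f.read().splitlines()

print(with_newlines)     # ['alpha\n', 'beta\n']
print(without_newlines)  # ['alpha', 'beta']
```

Using a with block also makes sure the file is closed, which the one-liner leaves to the interpreter.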

满治质2308 How to merge two files line by line in Python 3 -
虞废贾13248829915 ______ python test.py --input1 dat1.txt --input2 dat2.txt > 2.out.txt
#!/usr/bin/env python
# -*- coding: utf-8 -*-
__author__ = 'shengwei ma'
__author_email__ = '[email protected]'
import sys
import getopt
input_file1 = ""
input_file2 = ""
...
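
The script in the answer is cut off, so the following is not a reconstruction of it. It is only a minimal sketch of one way to merge two files line by line, assuming the lines should be joined with a tab and that the shorter file is padded with empty strings. The function name `merge_by_line` and the `sep` parameter are hypothetical.

```python
from itertools import zip_longest


def merge_by_line(path1, path2, out_path, sep="\t"):
    """Join line N of path1 with line N of path2 and write to out_path."""
    with open(path1, encoding="utf-8") as f1, \
         open(path2, encoding="utf-8") as f2, \
         open(out_path, "w", encoding="utf-8") as out:
        # zip_longest keeps going until the longer file is exhausted
        for a, b in zip_longest(f1, f2, fillvalue=""):
            out.write(a.rstrip("\n") + sep + b.rstrip("\n") + "\n")
```

For example, `merge_by_line("dat1.txt", "dat2.txt", "2.out.txt")` would write each pair of lines as one tab-separated line.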

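满治质2308 How to write a chat room in Python -
虞废贾13248829915 ______ 1. The server class. First you need a chat server, which is implemented here by subclassing the dispatcher class from asyncore. The code is as follows:
class ChatServer(dispatcher):
    """Chat server"""
    def __init__(self, port):
        dispatcher.__init__(self)
        self.create_socket(socket.AF_INET, socket....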

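满治质2308 Which Java function does the same thing as Python's strip? -
虞废贾13248829915 ______ As I recall, strip removes characters from the start and end of a piece of text. In a text file every line ends with "\n", which marks the line break (as far as I remember). The usual pattern is: for line in file: line.strip().split(), which removes the surrounding whitespace and splits each line into a list of words. In Java, the closest equivalent for whitespace is String.trim(). It has been a while since I touched this, so give it a try.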

(Editor: self-media)