Python Tutorial: How to split and merge large files using Python?

WBOY
Release: 2023-04-22 11:43:08

Sometimes we need to send a large file to someone, but the transmission channel gets in the way: email attachments have a size limit, or the network connection is unreliable. In that case we can split the large file into smaller files, send them one at a time, and have the receiving end merge them back together. Today I will share how to split and merge large files using Python.

Ideas and Implementation

A text file can be split by number of lines. Any file, whether text or binary, can be split by a specified size.
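The line-based idea can be sketched with the standard library alone. This is a minimal illustration rather than a definitive implementation: the function name `split_by_lines`, its parameters, and the `<name>_part<N>` naming scheme are assumptions made for this example.

```python
from itertools import islice
from pathlib import Path


def split_by_lines(src, out_dir, lines_per_part=10000):
    """Split the text file `src` into parts of at most `lines_per_part` lines.

    Hypothetical helper for illustration. Returns the part paths in order.
    """
    src = Path(src)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    # Binary mode leaves line endings and encoding untouched; iterating a
    # binary file still yields one line (ending in b"\n") at a time.
    with open(src, "rb") as reader:
        while True:
            # Take the next batch of lines without loading the whole file.
            chunk = list(islice(reader, lines_per_part))
            if not chunk:
                break
            part_path = out_dir / f"{src.name}_part{len(parts) + 1}"
            with open(part_path, "wb") as writer:
                writer.writelines(chunk)
            parts.append(part_path)
    return parts
```

For example, `split_by_lines("bigfile.txt", "./parts", lines_per_part=10000)` would write `bigfile.txt_part1`, `bigfile.txt_part2`, and so on into `./parts`. Concatenating the parts in order reproduces the original file byte for byte.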

Python's file reading and writing functions are enough to split and merge files. To split, set the size of each part, then repeatedly read that many bytes from the large file and write them into a new file. To merge, the receiving end reads the small files in sequence and writes their bytes into a single file in the same order.

Split

size = 1024 * 1000 * 10  # 10MB
with open("bigfile", "rb") as reader:
    part = 1
    while True:
        # read the next chunk of at most `size` bytes
        part_content = reader.read(size)
        if not part_content:
            print("split done.")
            break
        with open(f"bigfile_part{part}", "wb") as writer:
            writer.write(part_content)
        part += 1  # move on to the next part file instead of overwriting

Merge

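total_parts = 5  # number of part files to merge
with open("bigfile", "wb") as writer:
    # part files are numbered from 1 (bigfile_part1, bigfile_part2, ...)
    for i in range(1, total_parts + 1):
        with open(f"bigfile_part{i}", "rb") as reader:
            writer.write(reader.read())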
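The merge above hard-codes the number of parts and reads each part fully into memory. As a hedged variation, the sketch below discovers the parts by name and streams them with `shutil.copyfileobj`. The function name `merge_parts` and its parameters are assumptions for this example. It expects part names of the form `<prefix><number>`, such as `bigfile_part1`.

```python
import re
import shutil
from pathlib import Path


def merge_parts(directory, prefix, target):
    """Merge files named `<prefix><number>` in `directory` into `target`.

    Hypothetical helper for illustration. Returns the number of parts merged.
    """
    pattern = re.compile(re.escape(prefix) + r"(\d+)")
    numbered = []
    for path in Path(directory).iterdir():
        match = pattern.fullmatch(path.name)
        if match:
            numbered.append((int(match.group(1)), path))
    # Sort by the numeric suffix so that part10 follows part9, not part1.
    numbered.sort()
    with open(target, "wb") as writer:
        for _, path in numbered:
            with open(path, "rb") as reader:
                # copyfileobj copies in small buffers, so a part never has
                # to fit in memory all at once.
                shutil.copyfileobj(reader, writer)
    return len(numbered)
```

Calling `merge_parts(".", "bigfile_part", "bigfile")` would rebuild `bigfile` from every matching part in the current directory. It does not detect gaps in the numbering, so a missing part still has to be caught by other means, for instance by comparing file sizes.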

Use a third-party library

You can write this yourself, but someone else has already done it, so why not save some time and use their work directly? Install it with pip:

pip install filesplit

Split

from filesplit.split import Split

split = Split("./data.rar", "./output")
split.bysize(size=1024 * 1000 * 10)  # each file is at most 10MB

After execution, the split files appear in the output folder.

You can also split according to the number of file lines:

split.bylinecount(linecount=10000)  # each file has at most 10000 lines

Merge

Merging works on the small files in a folder. The tool requires a manifest file in that folder, in the following format:

filename,filesize,header
data_1.rar,10000000,False
data_2.rar,10000000,False
data_3.rar,10000000,False
data_4.rar,10000000,False
data_5.rar,1304145,False
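Because the manifest records a size for every part, the receiving end can check that all parts arrived intact before merging. The sketch below is an illustration built on the standard `csv` module, not part of filesplit. The function name `check_manifest` is hypothetical, and the default manifest file name `manifest` is an assumption that should be adjusted to the actual file name.

```python
import csv
from pathlib import Path


def check_manifest(directory, manifest_name="manifest"):
    """Compare each manifest row with the file on disk.

    Hypothetical helper; `manifest_name` is an assumed default.
    Returns a list of problems; an empty list means every listed part
    exists and has the recorded size.
    """
    directory = Path(directory)
    problems = []
    with open(directory / manifest_name, newline="") as handle:
        # DictReader uses the header row (filename,filesize,header) as keys.
        for row in csv.DictReader(handle):
            part = directory / row["filename"]
            expected = int(row["filesize"])
            if not part.is_file():
                problems.append(f"{row['filename']}: missing")
            elif part.stat().st_size != expected:
                problems.append(
                    f"{row['filename']}: expected {expected} bytes, "
                    f"found {part.stat().st_size}"
                )
    return problems
```

Running `check_manifest("./output")` before merging gives an early warning when a part was lost or truncated in transit.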

The merge code only needs the directory containing the parts, the target directory, and the name of the merged file:

from filesplit.merge import Merge

merge = Merge(inputdir="./output", outputdir="./merge", outputfilename="merged.rar")
merge.merge()

After execution, the merged file appears in the merge directory.

source:51cto.com