Python calculates office hours: CSV data processing and time difference calculation

This article presents a Python script that reads records from a CSV file and calculates the office hours for each ID within a specific month, using February as the example. Office hours here means the time between an in record and the following out record. The script does not depend on the Pandas library; it uses only the standard-library csv and datetime modules. The code logic is explained step by step, followed by caveats to keep in mind when applying the method.

Data preparation

First, prepare a CSV file containing an ID, a record type (in or out), and a timestamp for each record. Here are the contents of an example file (data.csv):

id,type,time
1,out,2023-01-01T08:01:28.000Z
1,in,2023-02-01T08:01:28.000Z
2,in,2023-02-01T09:04:16.000Z
2,out,2023-02-01T12:01:28.000Z
1,out,2023-02-01T13:34:15.000Z
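
To experiment without creating a file first, the sample can be fed to csv.DictReader from memory. This is a minimal sketch, assuming the header row has no spaces after the commas (otherwise the keys would come out as ' type' and ' time'). SAMPLE, rows, first and parsed are names made up for this illustration.

```python
import csv
import io
from datetime import datetime

# Hypothetical in-memory copy of the sample data.csv shown above.
SAMPLE = """id,type,time
1,out,2023-01-01T08:01:28.000Z
1,in,2023-02-01T08:01:28.000Z
2,in,2023-02-01T09:04:16.000Z
2,out,2023-02-01T12:01:28.000Z
1,out,2023-02-01T13:34:15.000Z
"""

# DictReader turns each data row into a dict keyed by the header names.
rows = list(csv.DictReader(io.StringIO(SAMPLE)))

first = rows[0]
parsed = datetime.strptime(first["time"], "%Y-%m-%dT%H:%M:%S.%fZ")
print(first["id"], first["type"], parsed)
```

Every value arrives as a string, which is why the timestamp has to be parsed explicitly before any arithmetic.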

Python code implementation

The following Python code shows how to read CSV files, filter data for February, and calculate the office hours for each ID.

import csv
import datetime

date_format = '%Y-%m-%dT%H:%M:%S.%fZ'
total_time = {}
# Month number of the target month (2 for February)
feb = datetime.datetime.strptime('2023-02', '%Y-%m').month

file_path = 'data.csv'
with open(file_path, 'r', newline='') as f:
    # Read every row as a dict keyed by the header names
    csv_file = csv.DictReader(f)
    list_of_dict = list(csv_file)

for d in list_of_dict:
    w_id = d['id']
    d_time = datetime.datetime.strptime(d['time'], date_format)
    dt = d_time.date()

    # Only the month number is compared, so the year is not checked
    if d_time.month == feb:

        # First record for this ID: create its entry
        if w_id not in total_time:
            total_time[w_id] = {"date": None, "last_in": None, "last_out": None,
                                "work_hour_s": 0., "work_hour_string": ""}

        update_time = total_time[w_id]
        update_time['date'] = dt
        if d['type'] == 'in':
            update_time['last_in'] = d_time
        elif d['type'] == 'out':
            update_time['last_out'] = d_time

        if update_time['last_out'] and update_time['last_in']:
            if update_time['last_out'] > update_time['last_in']:
                work_hour_s = update_time['last_out'] - update_time['last_in']
                # total_seconds() also counts whole days; .seconds would drop them
                update_time['work_hour_s'] = work_hour_s.total_seconds()

                up_time = int(update_time['work_hour_s'])
                hours, remainder = divmod(up_time, 3600)
                minutes, seconds = divmod(remainder, 60)

                formatted_duration = f"{hours:02d}:{minutes:02d}:{seconds:02d}"
                update_time['work_hour_string'] = formatted_duration

print(total_time)

Code explanation:

  1. Import the necessary modules: datetime handles dates and times, and csv reads the CSV file.
  2. Define the date format: date_format = '%Y-%m-%dT%H:%M:%S.%fZ' describes the timestamps in the CSV file.
  3. Initialize the data structure: total_time = {} stores the office hours information for each ID.
  4. Specify the target month: feb = datetime.datetime.strptime('2023-02', '%Y-%m').month yields the month number 2. The year is discarded, so February of any year will match.
  5. Read the CSV file: csv.DictReader reads the file and returns each row as a dictionary.
  6. Traverse the data: Iterate over every row read from the CSV file.
  7. Filter the target month: Check whether the month of the current row is February.
  8. Initialize ID data: If the current ID is not yet in the total_time dictionary, create an entry with the keys date, last_in, last_out, work_hour_s and work_hour_string.
  9. Update the last in/out time: Set last_in or last_out according to the value of the type field.
  10. Calculate office hours: If both last_in and last_out are set and last_out is later than last_in, compute the difference and store it in work_hour_s. Each new pair overwrites the previous value, so the field holds the duration of the most recent in/out pair rather than a running total.
  11. Format the output: Convert the number of seconds to a string in HH:MM:SS format and store it in work_hour_string.
  12. Output the result: Print the total_time dictionary, which contains the office hours information for each ID in February.
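
As written, the listing assigns the latest difference to work_hour_s instead of adding to it. If a total across several in/out pairs per ID is wanted, one possible approach is sketched below. It is not part of the original script: total_seconds_per_id, open_in and totals are hypothetical names, and the rows are assumed to be in chronological order for each ID.

```python
from datetime import datetime

DATE_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def total_seconds_per_id(rows, year, month):
    """Sum every completed in/out pair per ID for the given year and month.

    `rows` is an iterable of dicts with 'id', 'type' and 'time' keys,
    assumed to be in chronological order for each ID.
    """
    open_in = {}  # id -> datetime of the 'in' record still waiting for an 'out'
    totals = {}   # id -> accumulated seconds
    for row in rows:
        ts = datetime.strptime(row["time"], DATE_FORMAT)
        if (ts.year, ts.month) != (year, month):
            continue
        w_id = row["id"]
        if row["type"] == "in":
            open_in[w_id] = ts
        elif row["type"] == "out" and w_id in open_in:
            # Close the pair and add its length to the running total.
            start = open_in.pop(w_id)
            totals[w_id] = totals.get(w_id, 0.0) + (ts - start).total_seconds()
    return totals


rows = [
    {"id": "1", "type": "in", "time": "2023-02-01T08:00:00.000Z"},
    {"id": "1", "type": "out", "time": "2023-02-01T12:00:00.000Z"},
    {"id": "1", "type": "in", "time": "2023-02-01T13:00:00.000Z"},
    {"id": "1", "type": "out", "time": "2023-02-01T17:30:00.000Z"},
]
print(total_seconds_per_id(rows, 2023, 2))  # {'1': 30600.0}
```

Removing the in record once it has been paired means a stray second out record is ignored instead of being measured against an old in time. The sketch also compares the year, which the month-only filter does not.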

Output example:

{'1': {'date': datetime.date(2023, 2, 1),
       'last_in': datetime.datetime(2023, 2, 1, 8, 1, 28),
       'last_out': datetime.datetime(2023, 2, 1, 13, 34, 15),
       'work_hour_s': 19967.0,
       'work_hour_string': '05:32:47'},
 '2': {'date': datetime.date(2023, 2, 1),
       'last_in': datetime.datetime(2023, 2, 1, 9, 4, 16),
       'last_out': datetime.datetime(2023, 2, 1, 12, 1, 28),
       'work_hour_s': 10632.0,
       'work_hour_string': '02:57:12'}}
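
The HH:MM:SS conversion can be isolated in a small helper. The sketch below uses the hypothetical name format_duration and assumes a non-negative duration. It also illustrates why timedelta.total_seconds() is the safer input for this step: the .seconds attribute holds only the part of the duration that is left after whole days are removed.

```python
from datetime import timedelta


def format_duration(delta):
    """Format a non-negative timedelta as HH:MM:SS, letting hours exceed 23."""
    total = int(delta.total_seconds())
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


long_shift = timedelta(days=1, hours=2, minutes=3, seconds=4)
print(long_shift.seconds)           # 7384, the whole day is dropped
print(format_duration(long_shift))  # 26:03:04
```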

Things to note

  • Data order: The code assumes that the in and out records for each ID appear in chronological order, with each out record later than its in record. If the rows are not ordered, sort them by timestamp first.
  • Data integrity: The code does not handle missing records, such as an in record without a matching out record or the reverse. How to treat them depends on the application. For example, unpaired records can be ignored, or a missing in or out time can be filled with a default value.
  • Date format: Make sure the timestamps in the CSV file match the format defined by the date_format variable. If they differ, change date_format accordingly.
  • Time zones: The code treats the trailing Z as a literal character, so the parsed datetime objects carry no time zone information. This is harmless as long as every timestamp is in UTC, as in the sample. If the file mixes time zones, convert all timestamps to a single zone before calculating.
  • Error handling: Real applications should add error handling, for example try-except blocks that catch a missing file (FileNotFoundError) or a timestamp that does not match date_format (ValueError).
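
Several of these points can be combined in a small loader. The following is one possible sketch, not part of the original script: parse_rows and load_rows are hypothetical helper names, malformed rows are collected in a list instead of stopping the run, the Z suffix is assumed to mean UTC, and the rows are sorted by timestamp before any pairing takes place.

```python
import csv
import io
from datetime import datetime, timezone

DATE_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_rows(lines):
    """Return (rows, errors): rows sorted by time, errors as (line_no, message)."""
    rows, errors = [], []
    reader = csv.DictReader(lines)
    for row in reader:
        try:
            # The trailing 'Z' is assumed to mean UTC, so attach that zone.
            ts = datetime.strptime(row["time"], DATE_FORMAT).replace(tzinfo=timezone.utc)
            rows.append({"id": row["id"], "type": row["type"], "time": ts})
        except (KeyError, TypeError, ValueError) as exc:
            # Missing column, missing field, or a timestamp in another format.
            errors.append((reader.line_num, repr(exc)))
    rows.sort(key=lambda r: r["time"])
    return rows, errors


def load_rows(path):
    """Read a CSV file; a missing file yields no rows and one error entry."""
    try:
        with open(path, newline="") as f:
            return parse_rows(f)
    except FileNotFoundError as exc:
        return [], [(0, repr(exc))]


sample = io.StringIO(
    "id,type,time\n"
    "1,out,2023-02-01T13:34:15.000Z\n"
    "1,in,not-a-timestamp\n"
    "1,in,2023-02-01T08:01:28.000Z\n"
)
rows, errors = parse_rows(sample)
print([r["type"] for r in rows])  # ['in', 'out']
print(len(errors))                # 1
```

Returning the errors alongside the rows leaves the decision to the caller, who can log them, skip them, or abort.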

Summary

This article presented sample Python code that calculates office hours from a CSV file and explained its logic and caveats in detail. The code does not rely on the Pandas library; it uses the csv and datetime modules, and can serve as a starting point for other CSV processing tasks. Real applications will need to adapt it, in particular to handle missing records, time zones, and errors.
