Manipulate Time-range In A Pandas Dataframe

Need to clean up a csv import, which gives me a range of times (in string form). Code is at bottom; I currently use regular expressions and replace() on the df to convert other cha

Solution 1:

Here is a converter function based on the input data you described. convert_entry takes a complete entry, splits it on the dash, and passes each part to convert_single, since the two halves of an entry can be converted independently. It then joins the converted halves with a dash again.

convert_single uses a regular expression to pick out the relevant parts of a time string. The pattern starts with one or more digits, (\d+), for the hours; then an optional dot or colon followed by optional further digits, [.:]?(\d+)?, for the minutes; and finally an optional AM or PM, (AM|PM)?. For the sample data only PM matters, since it shifts the hour by 12.
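To see what each group captures, here is a small sketch that runs the same pattern over a few sample strings. The helper name show_groups and the sample inputs are illustrative assumptions, not part of the original answer.

```python
import re

# Same pattern as described above, written as a raw string so that
# "\d" is not treated as a string escape.
PATTERN = re.compile(r"(\d+)[.:]?(\d+)?(AM|PM)?")


def show_groups(s):
    """Return the (hours, minutes, meridiem) groups captured for one time string."""
    return PATTERN.search(s).groups()


print(show_groups("18.30"))   # ('18', '30', None)
print(show_groups("4PM"))     # ('4', None, 'PM')
print(show_groups("4.10PM"))  # ('4', '10', 'PM')
print(show_groups(" 17"))     # ('17', None, None)
```

Groups that do not take part in the match come back as None, which is why the converter falls back to "00" for missing minutes.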

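import re


def convert_single(s):
    # Groups: hours, optional minutes, optional AM/PM marker.
    # Raw string so "\d" is not parsed as a string escape.
    m = re.search(r"(\d+)[.:]?(\d+)?(AM|PM)?", s)
    if m is None:
        raise ValueError(f"no time found in {s!r}")
    hours = int(m.group(1))
    minutes = m.group(2) or "00"
    if m.group(3) == "PM" and hours < 12:
        hours += 12  # 4PM -> 16, but 12PM stays 12
    elif m.group(3) == "AM" and hours == 12:
        hours = 0  # 12AM is midnight
    return f"{hours:02d}:{minutes.zfill(2)}"


def convert_entry(value):
    # Convert both halves of "start-end" independently, then rejoin them.
    start, end = value.split("-")
    start = convert_single(start)
    end = convert_single(end)
    return "-".join((start, end))


values = ["15-18", "18.30-19.00", "4PM-5PM", "3-4", "4-4.10PM", "15 - 17", "11 - 13"]

for value in values:
    cvalue = convert_entry(value)
    print(cvalue)  # e.g. "4PM-5PM" -> "16:00-17:00"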
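Since the question is about a DataFrame column, the converter can be applied element-wise with Series.apply. This is a minimal sketch that assumes a hypothetical column named time_range; the column names and the sample frame are not from the original post. The converter is repeated in compact form (with a guard so that 12PM stays 12) so the block runs independently.

```python
import re

import pandas as pd


def convert_single(s):
    m = re.search(r"(\d+)[.:]?(\d+)?(AM|PM)?", s)
    hours = int(m.group(1))
    minutes = m.group(2) or "00"
    if m.group(3) == "PM" and hours < 12:
        hours += 12
    return f"{hours:02d}:{minutes.zfill(2)}"


def convert_entry(value):
    start, end = value.split("-")
    return "-".join((convert_single(start), convert_single(end)))


# Hypothetical frame; "time_range" is an assumed column name.
df = pd.DataFrame({"time_range": ["15-18", "18.30-19.00", "4PM-5PM", "15 - 17"]})

# Apply the converter to every string in the column.
df["time_range_clean"] = df["time_range"].apply(convert_entry)

# Optionally split the cleaned range into separate start and end columns.
df[["start", "end"]] = df["time_range_clean"].str.split("-", expand=True)

print(df)
```

Series.apply calls the Python function once per row, which should be fine for a typical CSV import. Rows with missing values would need handling first (for example with dropna), since convert_entry expects a string.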
