Converting Column to DateTime in Pandas

Time series data are frequently encountered when working with data in Pandas, and we are aware that Pandas is an excellent tool for working with time-series data in Python. Using the to_datetime() and astype() functions in Pandas, you can convert a column (of a text, object, or integer type) to a datetime. Furthermore, if you’re reading data from an external source like CSV or Excel, you can specify the data type (for instance, datetime).

This Pandas article will teach you how to change a column of dates in string format into a datetime format. The to_datetime() method will convert things like strings to datetime in the beginning. A neat feature is working with the format argument when using to_datetime(). In other words, we’ll also look at how to convert in the right format. The astype() method will then be used to complete this conversion process.

The syntax for the two methods is as follows:

df['Date'] = pd.to_datetime(df['Date'])
df['Date'] = df['Date'].astype('datetime64[ns]')

Whereas the first method in the syntax above can handle both a string and integers, the second method only deals with strings.

Let’s look at converting a string column in a dataframe (which is now in dd/mm/yyyy format) to datetime format. Incorrect date formats prevent us from doing any time series-based operations on the dates. We must change the dates into the datetime format to use it effectively.

Using the pd.to_datetime() function to convert the Pandas dataframe column from a string to a datetime format

# importing pandas as pd
import pandas as pd

# Creation of a dataframe
sport_df = pd.DataFrame({'Date':['10/25/2022', 10/23/2022', '10/29/2022'],
				'Sport':['Soccer', 'Tennis', 'Rugby'],
				'participants':[50000, 25000, 85000]})

# Print the dataframe
print(sport_df)

# We will now examine the 'Date' column's data type
sport_df.info()

As seen in the output, the "Date" column's data type is the object or string. Using the pd.to_datetime() function, we will now convert it to datetime format.

# convert the 'Date' column to datetime format
sport_df['Date']= pd.to_datetime(sport_df['Date'])

#Check the 'Date' column's formatting
sport_df.info()

The format of the “Date” column has been changed to the datetime format, as seen in the output.

Use the DataFrame.astype() function to change the column type of a Pandas dataframe from a string to a datetime format

# importing pandas as pd
import pandas as pd

# Creation of a dataframe
sport_df = pd.DataFrame({'Date':['10/25/2022', 10/23/2022', '10/29/2022'],
				'Sport':['Soccer', 'Tennis', 'Rugby'],
				'participants':[50000, 25000, 85000]})

# Printing of the provided dataframe
print(sport_df)

# We will now examine the 'Date' column's data type.
sport_df.info()

As seen in the output, the "Date" column's data type is the object or string. Now, we will use the DataFrame.astype() function to convert it to datetime format.

# conversion of the 'Date' column to a corresponding datetime format
sport_df['Date'] = sport_df['Date'].astype('datetime64[ns]')

# verification of the format of the 'Date' column's contents
sport_df.info()

The format of the “Date” column has been changed to the datetime format, as seen in the output.

Converting a data frame column from the ‘yymmdd’ format to the ‘yyyymmdd’ format

# importing pandas library
import pandas as pd

# Data set initialization for the nested list
dance_list = [['200712',50000],['200714',51000],['200716',51500],
			['200719',53000],['200721',54000],
			['200724',55000],['200729',57000]]

# creation of a pandas dataframe
pl_df = pd.DataFrame(dance_list,columns=['Dates','Dancers'])

# show the dataframe by printing
print(pl_df)
print()

# verify the type
print(pl_df.dtypes)


# conversion of the string to a corresponding datetime format
pl_df['Dates'] = pd.to_datetime(pl_df['Dates'], format='%y%m%d')

# show the dataframe by printing
print(pl_df)
print()

print(pl_df.dtypes)

In the example above, we change the format of the column “Dates” from “yymmdd” to “yyyymmdd” and the data type from “object” to “datetime64[ns]”.

Using pandas.to_datetime(), convert several columns from string to “yyyymmdd” format

# importing pandas library
import pandas as pd

# Initialization of the data set whose nature is a nested list
dancers_list = [['20200712',50000,'20200812'],
			['20200714',51000,'20200814'],
			['20200716',51500,'20200816'],
			['20200719',53000,'20200819'],
			['20200721',54000,'20200821'],
			['20200724',55000,'20200824'],
			['20200729',57000,'20200824']]

# creating a pandas dataframe
pl_df = pd.DataFrame(
player_list,columns = ['Dance_start', 'Dancers','Dance_end'])

#show the dataframe by printing
print(pl_df)
print()

# verification the type
print(pl_df.dtypes)

Example: converting a string to datetime format in multiple columns

pl_df['Dance_start'] = pd.to_datetime(
						df['Dance_start'],
						format='%Y%m%d'
)
pl_df['Dance_end'] = pd.to_datetime(
						df['Dance_end'],
						format='%Y%m%d'
)


# show the dataframe by printing
print(pl_df)
print()

print(pl_df.dtypes)

Change the columns “Dance_start” and “Dance_end” in the example above from “object” to “datetime64[ns]” type.

Example: Object to String Conversion Column

This section will examine how to change a column to a datetime format.

Observe how our dictionary has grown to include strings that include datetime (i.e., the values returned when the ‘dob’ entry is used). Therefore, the next action is to create a dataframe. We import pandas first, then utilize the dictionary as input and the pd.DataFrame class:

import pandas as pd

employee = {'net_pay':[13000, 23222, 33321, 43414, 53151],
       'f_name': ['Mike', 'Angelo', 'White', 'Green', 'Joy'],
       'no_of_shares':[5, 3, 7.17, 11.3, 33.15],
       'dob':['11/24/2020', '12/21/2019', '10/14/2018', '12/13/2017', '01/08/2017'],
       'reg_date': [20120902, 20130413, 20140921, 20140321, 20140321]}

print(employee)



df = pd.DataFrame(employee)
df.head()

# conversion of a column to datetime pandas
df['dob'] = pd.to_datetime(df['dob'])

We used the brackets and dataframe (df) in the previous code snippet. Additionally, we added a string containing the column we wished to convert between the brackets. Note that you can make this a new column by simply changing the word “dob” to the word “Datetime.” That would increase the dataframe’s column count.
We used the to_datetime() method to place the equal sign (“=”) on the right side. We utilize the column that we intended to convert because we aren’t working with any formatting. Here is the column in datetime format:

Example: How to Change an Integer Column

Here’s how to change an integer in a dataframe to a datetime:

import pandas as pd

employee = {'net_pay':[13000, 23222, 33321, 43414, 53151],
       'f_name': ['Mike', 'Angelo', 'White', 'Green', 'Joy'],
       'no_of_shares':[5, 3, 7.17, 11.3, 33.15],
       'dob':['11/24/2020', '12/21/2019', '10/14/2018', '12/13/2017', '01/08/2017'],
       'reg_date': [20120902, 20130413, 20140921, 20140321, 20140321]}

print(employee)

df = pd.DataFrame(employee)

# converting a column of numbers to date and time in pandas
df['reg_date'] = pd.to_datetime(df['reg_date'])

Please note that you may also use the format parameter if your dates are formatted differently than in these instances. Consult the documentation for more details on the to_datetime() method. You may now use Pandas value_count() method to count occurrences in a column after changing the data type in the dataframe.

Column to Date/Time Conversion when Reading an Excel File

When using Pandas read_excel to import data, follow these steps to convert a column to a datetime value:

df = pd.read_excel('pandas_convert_column_to_datetime.xlsx',
                 index_col=0, parse_dates=True)
df.info()

As you can see, we utilized the same option (parse date) as you would when reading a CSV file. Please note that the Excel file has set the date column as indexes. Change True to [0] to make this into a column.

You can now begin using other date-related methods, such TimeDelta, after converting your dates to datetime objects. Further, You can now determine the difference between two dates, for instance. You can also use a time-series plot, such as the Seaborn line plot, to plot your data. As the last point, both approaches allow you to add each index of the columns you need to convert to datetime if numerous columns need to be done.

Using DataFrame.astype() in converting the Pandas Dataframe’s Column Type From String to Datetime

Using the pandas DataFrame.astype() function is another way to change the datatype of the dataframe column to datetime.

#start by importing the panda's library as pd
import pandas as pd

#Creation of the dataframe
games = pd.DataFrame({'date':['10/29/2022','10/27/2022','10/28/2022'],
			'sport':['soccer','Tennis','Handball'],
			'charges':[3200,1700,2300]
			
# printing of the dataframe
print(games)

#we now have the task of checking the 'date' column's data type
print(games.info())
})

To build this illustration, we used the dataframe that we created in the previous example. We then saved it in the dataframe object using the same three columns from the dataframe and used the print() method to display it.

While the “charges” column has an int datatype, the “date” and “sport” columns both have object datatypes. Since we are changing the column to datetime datatype, the datatype conversion will be done on the “date” column.

# conversion of the 'date' column to datetime format
Entry['date'] =games['date'].astype('datetime64[ns]')

# check the format of the 'date' column
print(games.info())

Here, we’re using the panda’s function DataFrame.astype() to achieve the desired result.

We use the astype() method from the panda’s package in the sample code shown above.
The dataframe object’s name and the column you wish to convert to datetime format should be placed between large brackets following the assignment operator “=”. The name of the dataframe object and column name should also be placed with the “.astype()” function.

You must specify the necessary datatype between the parentheses of the astype() method. We have provided the “datetime64[ns])” datatype since we need to convert it to a datetime datatype. Finally, we used the info() method to display the datatypes.

Conclusion

When working with data in Pandas, time series data are frequently encountered. Python’s Pandas framework is the best choice for handling time-series data. We might need to convert the string to Datetime format with the dataset that has been provided. If a column currently saves the data as either a string or an object type, you might need to update the datetime formatting.

The datetime format allows various temporal operations to be performed on the column. You may, for instance, find the time difference between two dates and get precise information like the month, year, etc.

The pandas to_datetime() and astype() functions allow us to convert a column’s format from a String or even an Object to a Datetime. Using the datetime() method, which recognizes the format argument to define the format date & time, one can convert a DateTime into a DataFrame with a string column that is formatted in a specific fashion. When the dates have not been input correctly, the user will be unable to do any time-series-based operations on them. We must modify the dates to the required date-time format to handle this.

This article generally discusses changing a column with an object or string into a datetime datatype in a Pandas DataFrame.

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *