What are the commonly used operations on a pandas DataFrame?
Common operations used with pandas DataFrames include:
- Creating data: You can create a DataFrame from a list, dictionary, or NumPy array, or by reading a CSV file with read_csv().
- Accessing data: Data in a DataFrame can be accessed using slicing, indexing, labels, or conditional filtering.
- Viewing data: You can use the head() and tail() methods to see the first few rows or last few rows of a DataFrame.
- Descriptive statistics: You can use the describe() method to get summary statistics for the numeric columns of a DataFrame, such as count, mean, standard deviation, minimum, quartiles, and maximum.
- Data cleaning and processing: You can use the dropna() method to delete rows or columns containing missing values, the fillna() method to fill in missing values, and the replace() method to substitute specific values.
- Data sorting: A DataFrame can be sorted by one or more columns using the sort_values() method.
- Data grouping and aggregation: The groupby() method can be used to group data by specified columns, and aggregate functions such as sum(), mean(), count(), etc. can be used to calculate statistics on the grouped data.
- Data merging and joining: Multiple DataFrames can be merged or connected into one using methods such as concat(), merge(), and join().
- Column operations: You can use the rename() method to rename columns, the drop() method to delete columns, the astype() method to change data types, and the apply() method to apply custom functions to columns.
- Data visualization: Data in a DataFrame can be visualized using libraries such as matplotlib and seaborn, or through the DataFrame's own plot() method.
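As a rough illustration of the creating, viewing, accessing, statistics, cleaning, and sorting items above, here is a minimal sketch. The column names and values are invented for the example and are not from any real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, made up purely for illustration.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cara", "Dan"],
    "city": ["Oslo", "Rome", "Oslo", "Rome"],
    "score": [88.0, np.nan, 95.0, 70.0],
})

# Viewing data: first and last rows.
first_two = df.head(2)
last_one = df.tail(1)

# Accessing data.
scores = df["score"]              # one column, selected by label
row0 = df.iloc[0]                 # one row, selected by position
ann_city = df.loc[0, "city"]      # one cell, by row label and column label
high = df[df["score"] > 80]       # conditional filtering (NaN compares as False)

# Descriptive statistics for the numeric columns.
stats = df.describe()

# Data cleaning: each call returns a new DataFrame and leaves df unchanged.
no_missing = df.dropna()                                # drop rows with missing values
filled = df.fillna({"score": 0.0})                      # fill missing scores with 0
relabelled = df.replace({"city": {"Oslo": "Bergen"}})   # substitute specific values

# Sorting: missing values are placed last by default.
by_score = df.sort_values("score", ascending=False)
```

Most of these methods return a new object rather than modifying `df`, which is why each result is assigned to its own variable here.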
These operations are only a small part of what pandas DataFrames offer; many other methods are available, and which ones you use depends on your specific needs.
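The column operations, grouping, and merging items from the list can be sketched in the same way. Again, the two small tables below (`sales` and `managers`) and their column names are hypothetical, chosen only to keep the example short.

```python
import pandas as pd

# Hypothetical sample data, made up purely for illustration.
sales = pd.DataFrame({
    "region": ["north", "south", "north", "south"],
    "units": [10, 5, 7, 3],
    "price": ["2.5", "4.0", "2.5", "4.0"],   # stored as text on purpose
})
managers = pd.DataFrame({
    "region": ["north", "south"],
    "manager": ["Kim", "Lee"],
})

# Column operations.
sales = sales.rename(columns={"units": "qty"})             # rename a column
sales = sales.astype({"price": float})                     # text -> float
sales["revenue"] = sales["qty"] * sales["price"]           # derived column
sales["region_label"] = sales["region"].apply(str.upper)   # custom function per value
sales = sales.drop(columns=["region_label"])               # delete a column

# Grouping and aggregation.
totals = sales.groupby("region")["revenue"].sum()
summary = sales.groupby("region").agg(
    total_qty=("qty", "sum"),
    mean_revenue=("revenue", "mean"),
)

# Merging and joining: three ways to combine DataFrames.
merged = sales.merge(managers, on="region", how="left")         # key-based merge
joined = sales.join(managers.set_index("region"), on="region")  # join on an index
stacked = pd.concat([sales, sales], ignore_index=True)          # stack rows
```

In this sketch `merge()` and `join()` produce the same columns; `join()` matches against the other frame's index, which is why `managers` is re-indexed by `region` first.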