Posted October 8, 2018, 11:58:43 · Comments: 796
Learning Pandas Overview
If you are a Python programmer who wants to get started with performing data analysis using pandas and Python, this is the book for you. Some experience with statistical analysis would be helpful but is not mandatory.
This learner’s guide will help you understand how to use the features of pandas for interactive data manipulation and analysis.
This book is your ideal guide to learning about pandas, all the way from installing it to creating one- and two-dimensional indexed data structures, indexing and slicing-and-dicing that data to derive results, loading data from local and Internet-based resources, and finally creating effective visualizations to form quick insights. You start with an overview of pandas and NumPy and then dive into the details of pandas, covering pandas’ Series and DataFrame objects, before ending with a quick review of using pandas for several problems in finance.
With the knowledge you gain from this book, you will be able to quickly begin your journey into the exciting world of data science and analysis.
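As a rough illustration of the one- and two-dimensional indexed structures and the indexing and slicing mentioned above, here is a minimal sketch. It assumes pandas is installed; the labels, column names, and values are invented for illustration and do not come from the book.

```python
import pandas as pd

# One-dimensional indexed structure: a Series with string labels.
# (Labels and values here are made up for illustration.)
prices = pd.Series([10.0, 12.5, 11.0], index=["a", "b", "c"])

# Two-dimensional indexed structure: a DataFrame using the same labels.
frame = pd.DataFrame(
    {"price": [10.0, 12.5, 11.0], "volume": [100, 250, 175]},
    index=["a", "b", "c"],
)

# Label-based lookup on the Series.
price_b = prices["b"]

# Label-based slicing on the DataFrame; .loc slices include both endpoints.
first_two = frame.loc["a":"b"]

# Boolean selection: keep only the rows whose price exceeds 10.5.
expensive = frame[frame["price"] > 10.5]

print(price_b)
print(first_two)
print(expensive)
```

The Series and the DataFrame share the same index labels, so the same label can be used to look up values in either object.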
Learning Pandas Table of Contents
Learning pandas
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
1. A Tour of pandas
pandas and why it is important
pandas and IPython Notebooks
Referencing pandas in the application
Primary pandas objects
The pandas Series object
The pandas DataFrame object
Loading data from files and the Web
Loading CSV data from files
Loading data from the Web
Simplicity of visualization of pandas data
Summary
2. Installing pandas
Getting Anaconda
Installing Anaconda
Installing Anaconda on Linux
Installing Anaconda on Mac OS X
Installing Anaconda on Windows
Ensuring pandas is up to date
Running a small pandas sample in IPython
Starting the IPython Notebook server
Installing and running IPython Notebooks
Using Wakari for pandas
Summary
3. NumPy for pandas
Installing and importing NumPy
Benefits and characteristics of NumPy arrays
Creating NumPy arrays and performing basic array operations
Selecting array elements
Logical operations on arrays
Slicing arrays
Reshaping arrays
Combining arrays
Splitting arrays
Useful numerical methods of NumPy arrays
Summary
4. The pandas Series Object
The Series object
Importing pandas
Creating Series
Size, shape, uniqueness, and counts of values
Peeking at data with heads, tails, and take
Looking up values in Series
Alignment via index labels
Arithmetic operations
The special case of Not-A-Number (NaN)
Boolean selection
Reindexing a Series
Modifying a Series in-place
Slicing a Series
Summary
5. The pandas DataFrame Object
Creating DataFrame from scratch
Example data
S&P 500
Monthly stock historical prices
Selecting columns of a DataFrame
Selecting rows and values of a DataFrame using the index
Slicing using the [] operator
Selecting rows by index label and location: .loc[] and .iloc[]
Selecting rows by index label and/or location: .ix[]
Scalar lookup by label or location using .at[] and .iat[]
Selecting rows of a DataFrame by Boolean selection
Modifying the structure and content of DataFrame
Renaming columns
Adding and inserting columns
Replacing the contents of a column
Deleting columns in a DataFrame
Adding rows to a DataFrame
Appending rows with .append()
Concatenating DataFrame objects with pd.concat()
Adding rows (and columns) via setting with enlargement
Removing rows from a DataFrame
Removing rows using .drop()
Removing rows using Boolean selection
Removing rows using a slice
Changing scalar values in a DataFrame
Arithmetic on a DataFrame
Resetting and reindexing
Hierarchical indexing
Summarized data and descriptive statistics
Summary
6. Accessing Data
Setting up the IPython notebook
CSV and Text/Tabular format
The sample CSV data set
Reading a CSV file into a DataFrame
Specifying the index column when reading a CSV file
Data type inference and specification
Specifying column names
Specifying specific columns to load
Saving DataFrame to a CSV file
General field-delimited data
Handling noise rows in field-delimited data
Reading and writing data in an Excel format
Reading and writing JSON files
Reading HTML data from the Web
Reading and writing HDF5 format files
Accessing data on the web and in the cloud
Reading and writing from/to SQL databases
Reading data from remote data services
Reading stock data from Yahoo! and Google Finance
Retrieving data from Yahoo! Finance Options
Reading economic data from the Federal Reserve Bank of St. Louis
Accessing Kenneth French’s data
Reading from the World Bank
Summary
7. Tidying Up Your Data
What is tidying your data?
Setting up the IPython notebook
Working with missing data
Determining NaN values in Series and DataFrame objects
Selecting out or dropping missing data
How pandas handles NaN values in mathematical operations
Filling in missing data
Forward and backward filling of missing values
Filling using index labels
Interpolation of missing values
Handling duplicate data
Transforming Data
Mapping
Replacing values
Applying functions to transform data
Summary
8. Combining and Reshaping Data
Setting up the IPython notebook
Concatenating data
Merging and joining data
An overview of merges
Specifying the join semantics of a merge operation
Pivoting
Stacking and unstacking
Stacking using nonhierarchical indexes
Unstacking using hierarchical indexes
Melting
Performance benefits of stacked data
Summary
9. Grouping and Aggregating Data
Setting up the IPython notebook
The split, apply, and combine (SAC) pattern
Split
Data for the examples
Grouping by a single column’s values
Accessing the results of grouping
Grouping using index levels
Apply
Applying aggregation functions to groups
The transformation of group data
An overview of transformation
Practical examples of transformation
Filtering groups
Discretization and Binning
Summary
10. Time-series Data
Setting up the IPython notebook
Representation of dates, time, and intervals
The datetime, day, and time objects
Timestamp objects
Timedelta
Introducing time-series data
DatetimeIndex
Creating time-series data with specific frequencies
Calculating new dates using offsets
Date offsets
Anchored offsets
Representing durations of time using Period objects
The Period object
PeriodIndex
Handling holidays using calendars
Normalizing timestamps using time zones
Manipulating time-series data
Shifting and lagging
Frequency conversion
Up and down resampling
Time-series moving-window operations
Summary
11. Visualization
Setting up the IPython notebook
Plotting basics with pandas
Creating time-series charts with .plot()
Adorning and styling your time-series plot
Adding a title and changing axes labels
Specifying the legend content and position
Specifying line colors, styles, thickness, and markers
Specifying tick mark locations and tick labels
Formatting axes tick date labels using formatters
Common plots used in statistical analyses
Bar plots
Histograms
Box and whisker charts
Area plots
Scatter plots
Density plot
The scatter plot matrix
Heatmaps
Multiple plots in a single chart
Summary
12. Applications to Finance
Setting up the IPython notebook
Obtaining and organizing stock data from Yahoo!
Plotting time-series prices
Plotting volume-series data
Calculating the simple daily percentage change
Calculating simple daily cumulative returns
Resampling data from daily to monthly returns
Analyzing distribution of returns
Performing a moving-average calculation
The comparison of average daily returns across stocks
The correlation of stocks based on the daily percentage change of the closing price
Volatility calculation
Determining risk relative to expected returns
Summary
Index
Learning Pandas Excerpt
pandas and IPython Notebooks
A popular way to use pandas is through IPython Notebooks. IPython Notebooks provide a web-based interactive computational environment that combines code, text, mathematics, plots, and rich media in a single web-based document. They run in a browser and contain Python code that is executed in a local or server-side Python session, with which the notebook communicates over WebSockets. Notebooks can also contain markup and rich media content, and can be converted to other formats such as PDF, HTML, and slide shows.
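To make the idea of one document holding both text and code more concrete, here is a minimal sketch of a notebook as data. It assumes the JSON-based notebook file format (version 4) used by current Jupyter releases, which the excerpt itself does not describe; the cell contents are invented for illustration.

```python
import json

# A notebook is stored as a JSON document containing a list of cells.
# The layout below assumes notebook format version 4 (an assumption,
# not something stated in the excerpt).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            # A text cell written in Markdown markup.
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# A tiny notebook\n", "Text and code in one document."],
        },
        {
            # A code cell; outputs stay empty until the cell is executed.
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["print(1 + 1)"],
        },
    ],
}

# Serialize to the text that would be saved in an .ipynb file,
# then parse it back to show that it round-trips as plain JSON.
serialized = json.dumps(notebook, indent=1)
restored = json.loads(serialized)
cell_types = [cell["cell_type"] for cell in restored["cells"]]
print(cell_types)
```

If the serialized text were saved under a hypothetical name such as `example.ipynb`, a tool like `jupyter nbconvert --to html example.ipynb` could be used to convert it to HTML, assuming Jupyter and nbconvert are installed.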
This article was submitted by 清杉 and does not represent the views of 电子书资源网. To reprint it, please contact the original author.