Posts

CST 383 Week 4

This was our fourth week in CST 383 - Data Science. Reflection: This week we went into more detail on crosstabs: how to manipulate them and how to plot them. We also briefly covered violin plots, Seaborn facet grids, and how to plot more than two variables on a scatter plot. The second week introduced crosstabs without much detail, and I had been wondering when we would return to them. At first crosstabs seemed like a unique data structure, but after learning about them and manipulating them in code, they feel like any other DataFrame or 2D structure. Speaking of 2D structures, at the beginning of this course I found manipulating 2D arrays and DataFrames quite complex, and I often had trouble envisioning the data I was working on. After working with 2D structures for the past few weeks, I am happy to find that it now feels easier. While the syntax for crosstabs is more complicated, I was able to intuit the data more easily than I was abl...
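
As a rough illustration of the point that a crosstab behaves like any other DataFrame, here is a minimal sketch. The column names and values are invented for the example and do not come from the course material.

```python
import pandas as pd

# Hypothetical sample data, made up for illustration
df = pd.DataFrame({
    'day': ['Mon', 'Mon', 'Tue', 'Tue', 'Tue'],
    'weather': ['sun', 'rain', 'sun', 'sun', 'rain'],
})

# Count how often each (day, weather) combination occurs
counts = pd.crosstab(df['day'], df['weather'])

# The result is an ordinary DataFrame, so the usual tools apply
is_dataframe = isinstance(counts, pd.DataFrame)
tue_sun = counts.loc['Tue', 'sun']   # label-based lookup of one cell
row_totals = counts.sum(axis=1)      # total observations per day
```

Because the result is a regular DataFrame, indexing with .loc and aggregating with .sum work the same way they do on any other table.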

CST 383 Week 3

Reflection: This was our third week in CST 383 - Data Science. This week we learned about visualizing data with graphs using pandas, matplotlib, and scipy. We focused on definitions of various types of graphs, the functions that create them, and the parameters we can use to alter them. As with the array functions and operations from week one, I was surprised at how robust these libraries are, and also by how easy their basic uses are to learn. In very little time, I was able to start making nice-looking graphs, and with more research I could quickly make professional-quality ones. In previous weeks, I found half of the material to be relatively straightforward (namely 1D arrays and Series), and the rest of it to be more difficult (2D arrays and DataFrames). This week felt a little different, with all of the different types of graphs and parameters feeling about as advanced and complicated as each other. If one thing felt more difficult, it would be using the plt.subplot...

CST 383 Week 2

This is our second week in CST 383 - Introduction to Data Science. Pandas: This week we discussed Pandas, a Python library that is built on top of NumPy and designed for data analysis. Series: First we discussed the Pandas Series, which is analogous to a list or 1D array. A Series is made up of two 1D arrays: one contains the values, and the other contains the indices for those values. The two arrays can contain different data types. To create a Series we use pd.Series, such as x = pd.Series([.2, .4, .3, 1.0], index=['Mon', 'Tue', 'Wed', 'Thu']). If no index is explicitly given, the Series defaults to standard integer indices (0 through length - 1). We can also create a Series from a dictionary by passing the dictionary into pd.Series. For example, if we had the dictionary d={'Mon':0.2, 'Tue':0.4} we could create a Series with x=pd.Series(d). In the above example, the values ['Mon', 'Tue', 'Wed...
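
The Series examples in the excerpt above can be put together into a short runnable sketch. The variable names default_index and from_dict are introduced here only for illustration.

```python
import pandas as pd

# A Series with explicit string labels as its index
x = pd.Series([.2, .4, .3, 1.0], index=['Mon', 'Tue', 'Wed', 'Thu'])

# With no index given, pandas assigns 0 through length - 1
default_index = pd.Series([.2, .4, .3, 1.0])

# Building a Series from a dictionary: keys become the index
d = {'Mon': 0.2, 'Tue': 0.4}
from_dict = pd.Series(d)

# Values can be looked up by label
wed_value = x['Wed']
```

Printing x.index and default_index.index side by side is a quick way to see the difference between explicit labels and the default integer index.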

CST383 Week 1

This was our first week in CST 383 - Data Science. This class discusses machine learning, how to analyze data, and how to use Python for data analysis. Python Array Operations: In this week's video lecture we discussed various operations one can perform on Python arrays. Slicing: Python arrays can be sliced using [start:stop:step]. Start defaults to 0, stop defaults to the end of the array, and step defaults to 1. For example, to access the last three elements in an array we would use array[len(array) - 3:], to access the first half of the array we would use array[:len(array) // 2], and to access every other element we would use array[::2]. Fancy Indexing: We can use an array as a list of indices we want to get from a different array. For example, array[[0, 2, 4]] would return the first, third, and fifth elements of the array. Broadcasting: We can apply an operation to an array to quickly perform it on each element of that array. If we have the array [1, 2...
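
The slicing, fancy indexing, and broadcasting operations described in the excerpt above can be tried out with a minimal sketch. It assumes the arrays in question are NumPy arrays, since fancy indexing does not work on plain Python lists. The sample values are made up for illustration.

```python
import numpy as np

# Hypothetical sample data (assumes NumPy arrays, not plain lists)
array = np.array([10, 20, 30, 40, 50, 60])

# Slicing with [start:stop:step]
last_three = array[len(array) - 3:]    # last three elements
first_half = array[:len(array) // 2]   # integer division keeps the index an int
every_other = array[::2]               # step of 2 skips every second element

# Fancy indexing: a list of positions selects those elements
picked = array[[0, 2, 4]]

# Broadcasting: the scalar is applied to every element
doubled = array * 2
```

Note the use of // rather than / when computing the midpoint: in Python 3, / returns a float, which is not a valid slice index.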