PROBABILITY & NOTES

On my upcoming machine learning course, probability, and completion of R notes.

SEPTEMBER 7, 2016/BARRY COLONNA

Machine Learning

I think I finally decided on my next course of study. It seems like a big commitment to me for some reason, which is why I want to make sure I choose the best class. Realistically, since it’s free, I can stop at any point if it’s not what I expect or need and switch to something else. I just don’t want a repeat of the fiasco that was my first introduction to data science.

What have I determined?

I’m going to take the Machine Learning Specialization from the University of Washington on Coursera. I’ve spoken about it in the previous two journal entries, so I won’t go into too much detail here. The specialization includes six courses, which cover everything that I’m interested in learning.

I’m really excited to begin the first class. I searched through edX, and there’s nothing quite like this concentration there. If these classes go well, I’ll be ready to take more specialized and advanced courses in data science, machine learning, and programming.

By the time I’m finished with it, I’ll be nearly done with math too. It seems like it’s just around the corner, but I still have another 8 months or so before I reach that point.

It’s still exciting to be able to learn so much through Massive Open Online Courses (MOOCs).

Statistics

Speaking of math, I’m progressing through statistics on Khan Academy quickly. I expect that the class will be completed by next week.

Each of the statistics lectures is much longer than most of my previous mathematics lectures. This isn’t a problem, but seemingly short lessons easily turn into several hours. I mention it because I may not be as close to finishing as I believe, but I’m fairly confident.

I am realizing that I’ll need to take a full, or more advanced, statistics class later on once I complete linear algebra. This class covers a lot of important information, but it doesn’t contain all the areas that I’ll need. It appears to be more of a supplemental learning class.

Either way, it is useful for my current needs. I understand many of the concepts I studied in analytics much better, and it should help with most of the statistical concepts that will be covered in machine learning.

This week, most of my studies involved probability: expected outcomes, combinatorics, z-statistics, t-statistics, p-values, confidence intervals, etc.

In one lesson, Sal Khan was asked to calculate the probability of winning the Mega Millions lottery. The result was 1 in 175,711,536. By contrast, you have a 1 in 100,000,000 chance of being struck by lightning twice in your lifetime. That’s nearly twice as likely! I love that kind of trivia. As an aside, you’re also more likely to be canonized as a saint than to win the lottery, although in the former case you have to be dead.
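For anyone curious where a number like that comes from, here is a rough sketch in Python. I'm assuming a game format of 5 white balls picked from 56 plus 1 Mega Ball picked from 46. That format isn't stated above, but it reproduces the 175,711,536 figure exactly. The function and constant names are my own, made up for illustration.

```python
from math import comb  # requires Python 3.8+

# Assumed game format (not stated in the lesson summary above):
# pick 5 white balls out of 56, plus 1 Mega Ball out of 46.
WHITE_BALLS = 56
WHITE_PICKS = 5
MEGA_BALLS = 46

# Lightning figure quoted in the post: 1 in 100,000,000.
LIGHTNING_ODDS = 100_000_000


def jackpot_combinations(white_balls=WHITE_BALLS,
                         white_picks=WHITE_PICKS,
                         mega_balls=MEGA_BALLS):
    """Count the distinct tickets when order of the white balls doesn't matter."""
    # comb(n, k) is "n choose k"; each white-ball combination can be
    # paired with any one of the Mega Balls.
    return comb(white_balls, white_picks) * mega_balls


total = jackpot_combinations()
print(f"Jackpot odds: 1 in {total:,}")  # Jackpot odds: 1 in 175,711,536

# How many times more likely is the lightning scenario than the jackpot?
ratio = total / LIGHTNING_ODDS
print(f"Lightning is about {ratio:.2f}x more likely")  # about 1.76x
```

The key idea is that the order of the five white balls doesn't matter, so it's a combination rather than a permutation, and the Mega Ball simply multiplies the count.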

R Notes

I finished reading through and transcribing all of my R notes from The Analytics Edge on edX.

There were so many disorganized notes!

I now have an 18-page Word document with everything I need to recreate any of the statistical analyses we learned in the class. It took forever, but I’m glad I did it. It’s nearly impossible to adequately search through the text files I saved from each lesson, so this is much better.

I added helpful information next to commands and arguments so I know exactly what I’m doing when I use them in the future. The machine learning specialization that I’m about to embark on uses Python instead of R, so I have a feeling I’ll forget everything I learned in R by the time the learning track is over.

R and Python both have pros and cons, which I won’t get into today. From what I’ve read, Python seems to be the preferred language for many data science purposes. R also strikes me as more complex, with a steeper learning curve. However, I think R is more customizable and can be used in more applications. Because of that complexity, I always thought that if I could master R, I could work in any programming language. I could be wrong, as I’m clearly not an expert.

I’ll let you know how I feel once I start using Python. Perhaps it will be what I choose to work with in the future!

Conclusion

I’m sorry about the shorter-than-usual journal entry. With only classes to report on, I don’t have as much to say. I’ll probably begin my machine learning class tomorrow, so I’ll no doubt have much more to write about next week. Thank you for joining me. I hope you are happy and reaching for your dreams!





JOURNAL

This journal will be about my journey to become a data scientist and better myself through education and fitness.

I hope that my words inspire you to follow your dreams and show you that it's never too late to make a change.

SCHEDULE

Data science posts every Wednesday.

Health posts every other Sunday.
