BASIC MACHINE LEARNING

On machine learning, text analytics, and rational equations

JULY 27, 2016/BARRY COLONNA

This week was all about text analytics, machine learning, and rational equations.

That may not sound all that interesting, and some of it isn’t (I’m looking at you, rational equations), but I’ve really been looking forward to working on the first two topics.

Basic Machine Learning

This week we did some basic machine learning on The Analytics Edge on edX. That is so cool! Machine learning is such an amazing field with endless applications.

Think Skynet. Amazing.

Granted, for this assignment, we weren’t creating programs that would eventually seek to wipe out humanity. Also, machine learning is NOT the same thing as artificial intelligence; it’s one branch of it. Don’t get me started on the misuse of A.I. because I’ll go on a three-page rant.

For our assignment, we worked on being able to identify letters, such as in a post office scanner. Exciting, right?

I know, it doesn’t sound all that thrilling. But the amount that goes into something so seemingly basic is pretty incredible. For the purposes of our assignment, we learned how to identify the letters A, B, P, and R. We were able to predict those letters with 98% accuracy from over 3,000 versions of distorted fonts.

Imagine having to do all of that by hand!
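For a rough feel of what "predicting letters" means, here is a minimal sketch in Python. It is not the method or the data from the course: it uses a simple nearest-centroid classifier (a technique swapped in for illustration), and the two-number "measurements" for each letter are made up. The names `train_centroids`, `predict`, and `accuracy` are hypothetical helpers.

```python
from collections import defaultdict
from math import dist


def train_centroids(examples):
    """Average the feature vectors of each letter into one centroid per letter."""
    grouped = defaultdict(list)
    for features, letter in examples:
        grouped[letter].append(features)
    return {
        letter: tuple(sum(column) / len(rows) for column in zip(*rows))
        for letter, rows in grouped.items()
    }


def predict(centroids, features):
    """Return the letter whose centroid is closest to the given features."""
    return min(centroids, key=lambda letter: dist(centroids[letter], features))


def accuracy(centroids, examples):
    """Fraction of examples whose predicted letter matches the true letter."""
    correct = sum(predict(centroids, f) == letter for f, letter in examples)
    return correct / len(examples)


# Made-up two-number measurements per letter image (NOT the course data).
train = [
    ((2.0, 8.0), "A"), ((3.0, 9.0), "A"),
    ((6.0, 7.0), "B"), ((7.0, 8.0), "B"),
    ((5.0, 2.0), "P"), ((6.0, 3.0), "P"),
    ((9.0, 4.0), "R"), ((10.0, 5.0), "R"),
]
test = [
    ((2.5, 8.5), "A"), ((6.5, 7.5), "B"),
    ((5.5, 2.5), "P"), ((9.5, 4.5), "R"),
]

centroids = train_centroids(train)
print(accuracy(centroids, test))
```

The accuracy figure is just the share of held-out examples the model labels correctly. On real, distorted letters the measurements overlap far more than in this toy data, which is why a score in the high nineties is impressive.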

What is Machine Learning?

Machine learning is a branch of computer science, and it’s a huge part of data science.

Put simply, it involves using algorithms to learn from and make predictions about data.

It allows a computer to learn and grow when provided new data or information, without being explicitly programmed where to look. It’s closely related to, and sometimes used interchangeably with, predictive analytics.

Text analytics (natural language processing), which I write about briefly below, is one area that leans heavily on machine learning.

It’s used in a wide range of fields, including:

  • search engines,
  • internet & credit card fraud detection,
  • robotics,
  • predicting user preferences (e.g., Netflix),
  • video games,
  • handwriting recognition,
  • speech recognition (Apple’s Siri, Google Now, or Amazon’s Alexa),
  • medical diagnosis,
  • online advertising,
  • classifying DNA sequences,
  • economics,
  • bioinformatics,
  • and much more.

As time goes on, machine learning is only going to become a bigger and more important part of our lives.

Text Analytics

Natural language processing is concerned with improving interactions between computers and humans. Its goal is to get computers to understand, analyze, or generate the languages that humans use naturally.

Though they can process huge amounts of information, computers have historically had trouble with natural language due to slang, shorthand, social context, and differences in regional dialects. That makes analyzing things such as texts, emails, tweets, or speech especially difficult.

For our assignments, we attempted to extract information from tweets and a set of emails.

For Twitter, we were interested in determining, through statistical analysis, how people felt about a specific company. I won’t bore you with all of the steps involved, but the negative tweets tended to contain at least one of three words.
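The "one of three words" idea can be sketched as a simple keyword check. This is only an illustration: the post does not say which three words turned up, so the words in `NEGATIVE_WORDS` are hypothetical stand-ins, the sample tweets are invented, and `tokenize` and `looks_negative` are made-up helper names.

```python
import re

# Hypothetical stand-ins; the actual three words are not given in the post.
NEGATIVE_WORDS = {"awful", "broken", "worst"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def looks_negative(tweet, negative_words=NEGATIVE_WORDS):
    """Flag a tweet if it contains at least one of the negative words."""
    return any(token in negative_words for token in tokenize(tweet))


# Invented example tweets, not real data.
tweets = ["This phone is AWFUL!", "Loving the new update"]
flags = [looks_negative(tweet) for tweet in tweets]
print(flags)
```

A real sentiment model would learn which words matter from labeled tweets rather than having them typed in by hand, but the end result can look a lot like a rule of this shape.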

The emails we analyzed were publicly available copies from the Enron corruption case. We looked to see if it was possible for a computer to identify all of the emails related to the criminal case without requiring a person to read every single email. It wasn’t perfect, but the commands could be modified to be more accurate.
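Before a computer can sort emails like this, the text usually has to be turned into numbers. One common first step is a "bag of words" table that counts how often each word appears in each document. The sketch below assumes that approach (the post does not spell out the exact steps used), the one-line "emails" are invented, and `bag_of_words` and `document_term_matrix` are hypothetical helper names.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Count how often each lowercase word appears in one document."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def document_term_matrix(documents):
    """Build a sorted vocabulary and one row of word counts per document."""
    bags = [bag_of_words(doc) for doc in documents]
    vocabulary = sorted(set().union(*bags))
    rows = [[bag[term] for term in vocabulary] for bag in bags]
    return vocabulary, rows


# Invented one-line "emails", not taken from the Enron data set.
emails = [
    "Meeting moved to Friday",
    "Friday lunch meeting",
    "Quarterly report attached",
]

vocabulary, rows = document_term_matrix(emails)
print(vocabulary)
for row in rows:
    print(row)
```

Each row of counts can then be handed to a classifier along with a label saying whether a person judged that email relevant, so only a sample of the emails has to be read by hand.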

Text analytics in the field of criminal justice is relatively new and it’s being used more and more in court cases.

That was my introduction to text analytics. I’m still working on the end-of-week assignments, so I’m sure I’ll learn a little more.

We didn’t do anything too terribly complex, as this is a beginner’s class that only touches on these topics. But I found it incredibly interesting and I look forward to learning how to write algorithms one day.

Rational Equations

I was wrong, yet again, about my math schedule. Last week, I estimated that I had completed 25% of my Algebra II lessons.

Then I opened the longest lesson on earth, covering rational equations. It’s the longest lesson I have worked on at Khan Academy so far.

That’s no fault of Khan’s, by any means. It’s just that there is a lot of course material within the rational expressions lesson. It covers adding, subtracting, multiplying, and dividing fractions containing polynomials. Some examples are pictured.
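To give a sense of what these problems look like, here are two small worked examples in LaTeX. They are illustrative only and are not taken from the Khan Academy lesson. The first adds two rational expressions over a common denominator; the second divides by multiplying by the reciprocal and canceling common factors.

```latex
\[
\frac{1}{x+1} + \frac{2}{x-1}
  = \frac{(x-1) + 2(x+1)}{(x+1)(x-1)}
  = \frac{3x+1}{x^2-1}, \qquad x \neq \pm 1
\]
\[
\frac{x^2-1}{x+2} \div \frac{x-1}{x+2}
  = \frac{(x+1)(x-1)}{x+2} \cdot \frac{x+2}{x-1}
  = x+1, \qquad x \neq -2,\ 1
\]
```

The restrictions on x matter: they are the values that would make a denominator zero somewhere in the original problem, so they stay excluded even after everything cancels.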

I must say, I’m not a big fan.

It’s not that I find the subject difficult. Quite the contrary, actually. It’s mostly that I get tired of doing lengthy problems that I’m relatively comfortable with.

But I can’t skip any of the lectures or quizzes no matter how well I know the material. I’m compelled to do everything before moving on. I have issues...

But I want to make sure I don’t skip something important. Without the lectures, I wouldn’t have recalled how to solve these problems, and without the quizzes, I wouldn’t have been able to test that knowledge. I usually learn something new in every lecture video, which is great. I’m just not overly fond of these problems.

I still like math, mind you. I don’t want you to think I have changed my plans. Sometimes I just get burnt out doing hours upon hours of the same topic every day. I’m almost done learning how to solve rational equations, and then I’ll move on to a new topic or one that builds on what I learned here. I look forward to that.

Conclusion

Every single week, I sit down to write my journal and cannot think of a single word to write. Then it turns into this!

I hope you found the topics as fascinating as I do. I definitely suggest reading more about machine learning and natural language processing. People are doing some pretty incredible things with them right now.

Until next week! Keep learning, and I hope you’re all well!





JOURNAL

This journal will be about my journey to become a data scientist and better myself through education and fitness.

I hope that my words inspire you to follow your dreams and show you that it's never too late to make a change.

SCHEDULE

Data science posts every Wednesday.

Health posts every other Sunday.