What is the Prime Objective?

Author’s note: I have relocated from Silicon Valley to Research Triangle Park in North Carolina. Rick, the founder of Convergent Technologies, recently set up shop in Austin. He and I have a running bet on which city will be the bigger tech hub in ten years. We will keep you updated on who is winning!

The title of today’s post is not a typo. Star Trek fans know it is the prime directive—not objective. With that disclaimer, I’d like to discuss the scaling down of Watson, IBM’s flagship AI system. After limited success, IBM cut the team and its resources, and this year it halted sales. This was surprising to me. I watched with rapt enthusiasm as Watson took down the reigning Jeopardy champions almost a decade ago. Then, it was announced that its next feat would be to topple cancer!

Why use AI? Because science runs on papers. Based on a PubMed search, in 2018 alone, there were 203,607 research papers published on cancer. For a fun thought experiment, think of it this way: medical papers are relatively short at around five pages, and research papers are longer at around 12 pages. Let’s split the difference, giving a bit of favoritism to the medical papers, and call it an average of about seven pages. Then let’s assume that you are a very fast reader, taking in about a page every minute. It would take approximately three years to read a single year’s output of cancer literature, and that is if you forwent pesky things like sleep and food. Granted, not all of those papers are relevant to any one physician/scientist’s research, but there is a growing problem of how to process all that data (quality control is also an issue, but that is another topic altogether).
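If you like to see arithmetic rather than take my word for it, here is that back-of-the-envelope estimate checked in a few lines of Python (the page count and reading speed are the rough assumptions from above):

    papers = 203_607            # cancer papers published in 2018 (PubMed search)
    pages_per_paper = 7         # assumed average from the estimate above
    minutes = papers * pages_per_paper   # at one page per minute
    years = minutes / (60 * 24 * 365)    # no sleep, no food
    print(f"{minutes:,} minutes is about {years:.1f} years of nonstop reading")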

Enter AI, which is fast on the buzzer and can answer all those questions about famous rivers that Jeopardy tends to throw in there. If it can win at this game, why not point it at another problem and let it go? So why did it essentially fail? The tipping point came when it recommended a potentially fatal treatment for a cancer patient.

If you look at the greatest successes in AI that are dominating the headlines, you’ll note that all have one thing in common: a simple objective. Win this game; get an object from point A to point B without hitting anything. With cancer, it is not so simple. My cousin is an outstanding oncologist doing amazing research at Indiana University. He is level-headed with the exception of one phrase that makes him clench his teeth: “It cures cancer.” I cannot go into enough detail here to cover all the problems with that statement, but I can say with as much confidence as a scientist can lend that if someone tells you something definitively cures cancer, he or she is knowingly or unknowingly misleading you. Some questions you might want to follow up with: What type of cancer cells? (Cancer is not monolithic.) What do you mean by cure? Do you mean prevention of occurrence, cutting off the causative agent (like HPV, mutant cells, or a genetic predisposition), treatment of existing cancer, or prevention of recurrence?

Algorithms need a clear direction to improve. Nuance in the desired solution muddies the signal the underlying math is trained on. Additionally, the problem the algorithm is meant to solve needs to be nearly limitlessly testable. We can’t let an AI endlessly play against patient outcomes the way we could with a game like Go. “Machine learning” and “AI” are commonly used interchangeably, but there is an actual distinction. A true AI should be able to handle this nuance and thrive in situations in which it has not seen data tied to a specific outcome. Unfortunately, we are not there yet, but I do find it encouraging that IBM tried.
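To make “clear objective, limitlessly testable” concrete, here is a toy sketch (mine, not IBM’s) of a learner that can attempt its objective a hundred thousand times for free: an epsilon-greedy agent pulling slot-machine arms with invented payout odds. Win or lose is a crisp signal, and no patients are harmed by the failed attempts:

    import random

    true_payouts = [0.3, 0.5, 0.7]        # hidden win probability per arm (invented)
    estimates, counts = [0.0] * 3, [0] * 3

    for trial in range(100_000):
        # Explore 10% of the time; otherwise exploit the best estimate so far.
        if random.random() < 0.1:
            arm = random.randrange(3)
        else:
            arm = max(range(3), key=lambda a: estimates[a])
        reward = 1.0 if random.random() < true_payouts[arm] else 0.0  # crisp signal
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]     # running mean

    print([round(e, 3) for e in estimates])  # settles near the true payouts

Swap that crisp reward for “did the patient improve five years later?” and both the clarity and the free retries disappear; that is the wall Watson ran into.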

Big data is daunting and confusing, yet it is becoming more of a necessity to remain competitive in the modern workplace. We here at Convergent Technologies specialize in simplifying the seemingly opaque. We have helped many organizations of various scales implement sensible data solutions. Let us help get you there!

The Devil Is in the Fee-tails

We’ve created a stock market that moves too darn fast for human beings.

—A former vice chairman of Nasdaq

First, the take-home: Nearly all investment firms are using some form of algorithms and machine learning. The minor performance increase of one firm versus another generally does not outweigh the fees charged. Go with the cheapest, no matter how much they brag about their tech.

Recently, my wife was listening to an audiobook, and I happened to catch a little of the content, which focused on value investing. It was hitting all the usual notes—buy and hold has never really been beaten—and there was the obligatory reference to some contact the author had with Warren Buffett. The part that made me chuckle was the assurance that stock prices are updated up to the second. Given the automation of the market, the notion of a second seems almost quaint.

There is a great book by Michael Lewis called Flash Boys, which describes high-frequency trading in depth, but I will touch on the concepts here, mainly because I find the vocabulary hilarious. Essentially, trades can be made without any human input. Just put some rules in place and go, right? It reminds me of Ron Popeil: “Set it and forget it.” If a computer could learn the right rules, it could dominate and make me a bunch of money!

It’s important to note that this trading is wide open to exploitation, but we can trust the traders to do the right thing, right? In a scheme reminiscent of Tom Sawyer, traders can “spoof” the market. They place a ton of orders on a stock so everyone thinks it is in high demand; the price goes up, and since they are moving so fast, they can sell their stock at the high price and cancel their fake orders before anyone can blink an eye. Traders can also learn which stock moves their own customers’ orders are about to cause and position themselves quite nicely before executing those trades (this is called front-running). Additionally, they can take advantage of the time it takes for information to travel from one exchange to another, which is not a lot. All of this happens automatically, and it can lead to a flash crash, which sounds more fun than it actually is. If everyone has automated rules in place, a mild fluctuation can cause a snowball reaction; the bottom drops out of everything and pops right back up in a matter of minutes. An example of a mild fluctuation is a “fat finger error,” which is when someone simply enters the wrong order information.
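Here is a toy simulation of that snowball (every number is invented for illustration): each bot has a stop-loss rule, so a single fat-finger sale can set them all off in sequence.

    price = 100.0
    # Each bot sells once if the price falls below its trigger.
    triggers = sorted([99.5, 99.0, 98.0, 96.0, 93.0, 89.0], reverse=True)
    impact = 0.03              # assume each forced sale knocks ~3% off the price

    price *= 0.99              # the "fat finger": a mistaken 1% sell-off
    for t in triggers:
        if price < t:
            price *= 1 - impact   # a stop-loss fires, deepening the drop
            print(f"stop-loss at {t} fired; price now {price:.2f}")
    # Minutes later, bargain-hunting rules buy the dip and the price snaps back.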

This sounds crazy, but there is a tremendous amount of competition over fractions of pennies and fractions of seconds. The good news is that these are just fractions. A sophisticated trading algorithm is going to edge out competitors by only a fraction of a percentage point, and most algorithms written by someone on a laptop in a Starbucks could at least compete. Data scientists often joke about the lengths to which some of us will go to gain a point or two of performance on a model. A fraction may be a huge deal if you are working at huge volume, but if someone is offering to increase your profits by 0.2 percent in exchange for a 2 percent cut, the math really does not add up. Go with the group that keeps a really low overhead and passes the savings on to its customers, because it is all automated now.
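Spelled out in code, with my own illustrative assumptions (a 6 percent baseline year, fees taken on assets, and the bargain shop charging 0.2 percent):

    baseline = 0.06                    # assumed market-matching annual return
    fancy = baseline + 0.002 - 0.02    # 0.2% edge minus a 2% fee: edge swamped
    cheap = baseline - 0.002           # no edge, 0.2% fee
    print(f"fancy shop: {fancy:.1%}  cheap shop: {cheap:.1%}")  # 4.2% vs 5.8%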

Big data is daunting and confusing, yet it is becoming more of a necessity to remain competitive in the modern workplace. We here at Convergent Technologies specialize in simplifying the seemingly opaque. We have helped many organizations of various scales implement sensible data solutions. Let us help get you there!

Automating the Automation

Automation is a scary thing. While I was in Silicon Valley, I stayed at a house that was a short distance from Facebook. In that area, it is almost impossible to get by without roommates, and I had four. Three of them worked at Facebook. I got to know the inner workings a bit, and I met a lot of the really bright people working there. One was an extremely talented software engineer who seemed to be getting a promotion every month. He held a high-level position in one of the biggest and most profitable companies in the world, and he was concerned that automation would replace his job. Specifically, he was worried about Google developing programs that could write programs. The self-referential loop of that idea reminds me of an M.C. Escher print I hung in my dorm room; it seemed deep at the time but now feels a bit silly and pretentious.

That employee’s concern is legitimate, especially given the current rise of generative technology (see my previous blog post). The question is: can we automate machine learning? Well, yes, definitely some aspects, but end to end is still a ways off. Let me clarify. There are three steps in machine learning. The first is data preprocessing/cleaning/planning. The second is the actual machine learning. The third (and sometimes unfortunately overlooked) is interpretation and sanity checks on the model. Arguably, there is a fourth step covering deployment and maintenance, but that is outside our scope right now.

Of the three steps, the shortest and most fun is the machine learning. The other two require significantly more work. Which steps can be automated? Data preprocessing is unique to each project. It generally involves pulling from several tables and wading through noise, missing data, and things that just do not make sense. It is almost always a mess, but it is a snowflake in that no two messes are the same. There are some best practices for determining whether the data is suitable for machine learning at all, but those only get you started. The third step, where the bulk of the remaining work lies, is interesting. There is a tremendous number of metrics for measuring the performance of a model. A good machine learning scientist will read those metrics in the following way: terrible performance = unsolvable with this data; fair performance = potential room for improvement; good performance = the sweet spot; great performance = something is wrong. This process requires a human set of eyes.
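As a sketch of what those human eyes look at, here is a quick scoring pass with scikit-learn on one of its built-in toy datasets (the model and metric choices are mine, for illustration):

    from sklearn.datasets import load_digits
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    X, y = load_digits(return_X_y=True)
    model = RandomForestClassifier(random_state=0)

    # Score the same model several ways; one number is never the whole story.
    for metric in ["accuracy", "f1_macro", "roc_auc_ovr"]:
        scores = cross_val_score(model, X, y, cv=5, scoring=metric)
        print(f"{metric}: {scores.mean():.3f}")

    # Rule of thumb from above: a suspiciously perfect score on messy business
    # data usually means a leak (the label snuck into the features), not genius.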

Machine learning can be like asking a teenager to clean your house: the place may look spotless after a few minutes, but the rug is extremely lumpy. That leaves step two. We have the data ready to go; now we just need to find the right model with the right parameters. While there has been some work on which data works best with which model, the “no free lunch” theorem still applies here. You have to search to find the best. Several packages are freely available to automate this search, sklearn and H2O to name a couple. This is also a growing area on the service and startup side of machine learning, and the big dogs are getting in on it as well with Amazon SageMaker, Azure Machine Learning, and Google AutoML. The take-home is that finding a model is easy, but everything around it still requires traditional human participation.
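For a flavor of that automated search, here is a minimal sketch using scikit-learn’s grid search over two model families (the dataset and parameter grids are toy choices, not recommendations):

    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV, train_test_split

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # "No free lunch": try more than one family of model.
    candidates = [
        (LogisticRegression(max_iter=5000), {"C": [0.01, 0.1, 1, 10]}),
        (RandomForestClassifier(random_state=0),
         {"n_estimators": [100, 300], "max_depth": [None, 5]}),
    ]

    best_score, best_model = 0.0, None
    for estimator, grid in candidates:
        search = GridSearchCV(estimator, grid, cv=5)
        search.fit(X_train, y_train)
        if search.best_score_ > best_score:
            best_score, best_model = search.best_score_, search.best_estimator_

    print(best_model)
    print(f"held-out accuracy: {best_model.score(X_test, y_test):.3f}")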

Big data is daunting and confusing, yet it is becoming more of a necessity to remain competitive in the modern workplace. We here at Convergent Technologies specialize in simplifying the seemingly opaque. We have helped many organizations of various scales implement sensible data solutions. Let us help get you there!


So there is this thief, and there is this cop…

Let’s do something fun today. I have seen the latest Facebook craze (or at least it was, a couple of weeks ago): face aging. Personally, I never watched the last scene of Indiana Jones and the Last Crusade and thought, “That would make a great app. People would love to see themselves aged at an insane rate!” I guess the next big thing will be an app that lets you see what it would look like if someone ripped out your heart.

The big question is: How is this done? Well, there are several methods, but one that could work is something called a generative adversarial network (GAN). That sounds a bit intimidating, but if you break down the name, it becomes easier. Network just implies that we are going to use some neural network, which at its core is nodes and weights. Next is the generative part. It can generate almost anything: a picture, text, music, or even a signal. You could write a generator for chocolate chip cookie recipes. The possibilities here are the reason this is a very hot area of research in the field.

Now for the tricky part: the “adversarial” bit. This goes back to the title. Within this fascinating method, there are actually two models being built and trained. One model is a rotten forger (and consequently the one we will be interested in at the end), and the other is a noble, upstanding cop. The forger’s job is to create fakes, and the cop’s job is to distinguish the fakes from the real deal. For the basic version, we just pick an output. Let’s stick with our example and say “faces”: we want to be able to generate pictures of faces. More specifically, we want our forger to create new fake face images that can pass the cop undetected. How do we train it? Well, we start with a bunch of pictures of faces. We give our forger random numbers to transform into a face (yeah, this part is a bit counterintuitive), which allows us to generate a nearly infinite number of new faces. We then feed the fakes and the real pictures to the cop, and a cycle starts in which each gets better and better at his job until we have created a master forger.
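For the curious, here is a minimal sketch of that cycle in PyTorch. To keep it runnable, the “real faces” are just points drawn from a made-up Gaussian rather than images, and the tiny architectures are my own arbitrary choices:

    import torch
    import torch.nn as nn

    latent_dim, data_dim = 8, 2
    # The forger (generator) turns random numbers into fake samples.
    G = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, data_dim))
    # The cop (discriminator) outputs the probability that a sample is real.
    D = nn.Sequential(nn.Linear(data_dim, 32), nn.ReLU(),
                      nn.Linear(32, 1), nn.Sigmoid())
    opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
    opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
    loss = nn.BCELoss()

    for step in range(2000):
        real = torch.randn(64, data_dim) * 0.5 + 2.0   # stand-in for real faces
        fake = G(torch.randn(64, latent_dim))          # random numbers in, fakes out

        # Train the cop: call real 1 and fake 0.
        opt_d.zero_grad()
        d_loss = (loss(D(real), torch.ones(64, 1))
                  + loss(D(fake.detach()), torch.zeros(64, 1)))
        d_loss.backward()
        opt_d.step()

        # Train the forger: make the cop say 1 on fakes.
        opt_g.zero_grad()
        g_loss = loss(D(fake), torch.ones(64, 1))
        g_loss.backward()
        opt_g.step()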

Great, now we can make faces, but how do we age them? This is where it gets even more complex—but not outside the scope of our example. If we have “before and after” examples to show the cop, he can determine whether or not a transformation is genuine, so we would need a lot of older and younger pictures of different people. We then show our forger the “before” pictures, tell him to do the transformation, and start the cycle all over again, having the two battle it out until nothing either one can do will change the game (a very rough approximation of the Nash equilibrium from game theory). Now our generator can take a picture of someone’s face and make realistic changes to it so that the person looks older, or at least like he or she has had a rough week. We could get into latent vectors, but that is probably best saved for another time. Additionally, this is just one method for this type of problem, and it is really just an archetype. The types of models used for the forger or the cop are open ended, and there are quite a few variations on this approach. I suspect we will see more and more of these nifty little applications in the future, but I hope they are more pleasant.

Big data is daunting and confusing, yet it is becoming more of a necessity to remain competitive in the modern workplace. We here at Convergent Technologies specialize in simplifying the seemingly opaque. We have helped many organizations of various scales implement sensible data solutions. Let us help get you there!

Bring Your Own Bias

I enjoy making beer. Generally, I’ll try to make a batch for special occasions or just because it’s mid-January. Beer is easy and cheap to make—and therefore, ubiquitous. However, there are nearly limitless possibilities for beer recipes, so people’s tastes in it are extremely subjective. Unless you’ve brewed a batch that could be described as “chewy,” your beer is not objectively bad.

What’s the connection between beer and big data? Let’s say we can generate labeled data cheaply, and we want to separate our data points into “good” and “bad.” It then becomes a simple process of finding the features that distinguish the two groups to the greatest degree possible. This is a no-brainer with beer; there are many people willing to show up and give detailed opinions on it. Then you can brew the beer with the broadest appeal. The same happens with many highly subjective things. In Ian Ayres’s fantastic book Super Crunchers, he even hints that there may be an algorithm for top-grossing movies. The book was published in 2007, and the first Marvel Cinematic Universe movie, Iron Man, came out in 2008. Apparently, the presence of on-screen snow can increase the draw for some odd reason.
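Here is a minimal sketch of “find the features that distinguish the groups,” with invented beer data (the feature names, ratings, and weightings are all made up for illustration):

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(0)
    features = ["bitterness", "abv", "maltiness", "carbonation"]
    X = rng.uniform(0, 1, size=(500, 4))
    # Pretend tasters mostly key on bitterness and ABV when calling a beer "good."
    y = ((0.7 * X[:, 0] + 0.3 * X[:, 1]) > 0.5).astype(int)

    clf = RandomForestClassifier(random_state=0).fit(X, y)
    for name, importance in sorted(zip(features, clf.feature_importances_),
                                   key=lambda pair: -pair[1]):
        print(f"{name}: {importance:.2f}")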

We can translate this into a business solution by never underestimating A/B testing and customer feedback. It may be impossible to please everyone, but there is a mathematical route to pleasing as many people as possible. If you are launching a new product, look at the current products you have that are easy to produce; those historical examples can help you identify essential features for the new product and crunch the right numbers, improving the odds of a successful launch. Everyone has his or her own unique biases and tastes—but think of how many people have a taste for some type of beer!
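Since A/B testing came up, here is what the core calculation can look like: a standard two-proportion z-test, with made-up conversion counts, using scipy:

    from math import sqrt
    from scipy.stats import norm

    conv_a, n_a = 120, 2400    # variant A: 120 conversions from 2,400 visitors
    conv_b, n_b = 156, 2380    # variant B (all counts invented)

    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - norm.cdf(abs(z)))   # two-sided

    print(f"A: {p_a:.1%}  B: {p_b:.1%}  z = {z:.2f}  p = {p_value:.4f}")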

Big data is daunting and confusing, yet it is becoming more of a necessity to remain competitive in the marketplace. We here at Convergent Technologies specialize in sensible data solutions. Let us help your company or organization make an objective process out of what may seem like a hopelessly subjective one!

The Target

First, I just want to say how excited I am to be part of Convergent Technologies. My dear friend, Rick Gregson, asked me to join the company to bring my knowledge of data science to the table. I was also invited to contribute a weekly blog, so let me back up for a second and define its purpose.

My bio is available on the Convergent Technologies website if you’re curious about who I am and why I may be positioned to tackle these subjects. I’ve invested a great deal of time studying “big data.” With that said, there is a plethora of material on big data—in particular, on AI (artificial intelligence). Many news stories focus on the sensational, such as self-driving cars or the toppling of champions at Go, the ancient Chinese board game. All of this can be daunting, and if you try to learn more about these topics, you can quickly be inundated with math, programming, and statistics.

My overall purpose focuses on two basic questions. The first is, “How does data science impact or potentially improve my daily life?” For example: Why did I use a dating site only to end up on a terrible date, while my Netflix account always knows what I want to watch? Do I need to worry about automation taking over my job? Do we even need doctors anymore when a machine can diagnose our ailments? The second, more pertinent to our consultancy’s goals, is, “Can data science help my business or help me make money?” For example: What problems can I solve with data science? Do I have enough data for that? I hired a consultant who just graduated college, and he’s already spent $9,000 of my money on AWS (Amazon Web Services); is that normal?

We will even tackle fun topics like using ML (machine learning) to get a sports betting advantage and maybe even delve into day trading stocks with algorithms. If I discuss specific algorithms in this blog, it will be at an intuitive level only, so let’s leave the math out and opt for comparative examples. However, if you feel like diving deep into the mathematics and science of anything I explore, please feel free to reach out to me. I am always up for that! Feedback is appreciated, and topic suggestions are welcome. Let’s dig in and see what data can do!

Can Data Really Lead to Love?

My wife is from Kenya. This has nothing to do with big data or the fun yet ineffective algorithms (spoiler alert) that I will get into later. It just means that she has not seen a lot of the same movies that I have, so we have been on a bit of a romantic comedy binge of late. That is her favorite genre. It is not mine. My favorite genre (big surprise) is science fiction. Very rarely have these two genres met. However, I would challenge someone to re-edit “2001: A Space Odyssey” into a romantic comedy. HAL 9000 is an impressionable AI, fresh from the planet, trying to make it in the big space station. He thought he had it all figured out, until one day . . .

Rick, Convergent Technologies’ indisputably tech-savvy senior consultant, and I have often discussed the intersection of data and dating, since he has actually tried a few online sites designed to streamline the search for a soulmate. I was lucky enough to find my wife working down the hall from me, but for those who haven’t had it that easy, the onslaught of machine learning algorithms raises the question: Can they help me find love?

There are a couple of great papers on this. The first, User Recommendations in Reciprocal and Bipartite Social Networks–An Online Dating Case Study, takes a Netflix-style approach, building a collaborative filtering technique: Bob likes Suzy; Bill is like Bob, so Bill will probably like Suzy; and since Sally is like Suzy, Bill will probably like Sally, too. The authors incorporate an attractiveness metric for good measure, so that it is not just like buying groceries. The second, Online Dating: A Critical Analysis from the Perspective of Psychological Science, digs into the psychology side of things. Its conclusion is that online dating offers access to a tremendous number of people, but metrics are a poor substitute for experience . . .
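For intuition, here is a bare-bones sketch of the collaborative filtering idea (the user-similarity flavor) on an invented like/dislike matrix; all the names and ratings are made up:

    import numpy as np

    profiles = ["Suzy", "Sally", "Anne"]
    # Rows are Bob and Bill; 1 means "liked," 0 means unrated.
    ratings = np.array([
        [1, 1, 0],   # Bob liked Suzy and Sally
        [1, 0, 0],   # Bill liked Suzy
    ], dtype=float)

    def cosine(u, v):
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

    # Bill looks like Bob, so Bob's other likes become Bill's recommendations.
    similarity = cosine(ratings[0], ratings[1])
    for j, name in enumerate(profiles):
        if ratings[1, j] == 0 and ratings[0, j] == 1:
            print(f"Recommend {name} to Bill (similarity to Bob: {similarity:.2f})")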

With all of our technical expertise and the sophistication of modern systems, maybe it would still be better to invite someone over for a good romcom than wait to hear that “you’ve got mail.”


Competitive Advantage

I was asked out for a cup of coffee by a fellow data scientist in the Valley. He had a particular motive: he wanted to know if there was a way to make a profit in daily fantasy sports betting. He had done well betting on some NBA games, but when he tried baseball, he was getting his clock cleaned. He wanted to know why, so I asked about his approach and methods. He was closed off to sharing, which in the era of open source was a bit of a red flag, but I pressed on. I asked, “What exactly are you trying to do?” His response was that he wanted to have better projections than everyone else. I asked him why. He stared back at me as blankly as my four-month-old son does when I ask him why he keeps trying to put everything in his mouth.

So I expounded. Statisticians love love love baseball. They love it so much that the love manifested itself into a Brad Pitt movie. If bioinformatics did this, we would get a feature film starring Tommy Wiseau. It would therefore be extremely difficult to outproject the leading sites. So how do you gain a competitive advantage when everyone is already using the state of the art? First, make sure everyone actually is using the state of the art. Rather than building a better mousetrap, maybe find where the mice are. In this particular case, I asked if his site tracked users’ stats. He said it did. I told him to follow the 80/20 rule and go after the weaker players, and if that did not work, to change to a different sport (a sketch of that idea follows below).

The takeaway is that data science is much like gambling in general: a lot of the progress is made in the strategy, not the implementation. No one group can compete head-on with Google, but luckily Google is not ubiquitous . . . yet. Pick the battles that no one else is fighting.
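Here is what “go after the weaker players” might look like in code, with an invented table of opponents (the user names, counts, and the 80/20 cutoff are all illustrative):

    import pandas as pd

    opponents = pd.DataFrame({
        "user": ["ace99", "sharkbait", "casual_carl", "newbie_nina", "grinder_g"],
        "contests": [420, 15, 60, 8, 900],
        "wins": [260, 5, 21, 1, 540],
    })
    opponents["win_rate"] = opponents["wins"] / opponents["contests"]

    # The bottom 20% by win rate are the tables worth sitting at.
    cutoff = opponents["win_rate"].quantile(0.2)
    print(opponents[opponents["win_rate"] <= cutoff])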