ELVTR Magazine

AI Solution Architect Duc Haba: “To stay ahead of the curve in AI you need to ‘unlearn’ things!”

What solutions architects and business owners need to know about the GenAI revolution.

If 2023 was the year of generative AI, 2024 promises to turn AI into a mainstream technology, says Duc Haba, author of an Amazon best-seller on data augmentation with Python. The California-based technologist started his career at Xerox PARC where he contributed to expert systems, progressing to pivotal roles at Oracle, Viant Consulting and Cognizant. More recently, he has orchestrated AI strategies as CTO at LeVar Burton's RRKidz media company.

His ELVTR course on AI solution architecture, ideal for data scientists, budding solution architects and business owners, will help students learn the ropes of AI through a data-first approach.

In the following Q&A, Haba shares his thoughts on how the generative AI industry will evolve, why you should incorporate user feedback before launching an AI solution, and why ‘unlearning’ is as important as learning new skills.

Why does a solution architect need to master AI?

A solution architect is like a building architect, except that they oversee the creation of a huge software system rather than a building. What’s different about AI is that it’s a new technology, so it’s not the same as designing an app or website. You need to know more about the technology.

One main thing is to know your data and how to access it. Most of the time is spent on cleaning the data: more than 60 percent of project time goes to gathering and cleaning data before you actually get into training or refining AI models. Solution architects may not understand that nuance.
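To make the data-cleaning point concrete, here is a minimal Python sketch of the kind of work that happens before any model training. The records, field names and cleaning rules are hypothetical and chosen only for illustration; a real project would have many more rules.

```python
# Hypothetical raw records; field names and values are illustrative only.
raw_records = [
    {"id": "1", "age": "34", "city": " San Francisco "},
    {"id": "2", "age": "", "city": "san francisco"},
    {"id": "2", "age": "", "city": "san francisco"},  # repeated id
    {"id": "3", "age": "abc", "city": "Oakland"},
    {"id": "4", "age": "51", "city": "OAKLAND"},
]


def clean(records):
    """Drop rows with a repeated id, normalise text, reject unusable ages."""
    seen_ids = set()
    cleaned, rejected = [], []
    for row in records:
        if row["id"] in seen_ids:
            continue  # skip rows whose id was already processed
        seen_ids.add(row["id"])
        age_text = row["age"].strip()
        if not age_text.isdigit():
            rejected.append(row)  # missing or non-numeric age
            continue
        cleaned.append({
            "id": int(row["id"]),
            "age": int(age_text),
            "city": row["city"].strip().title(),
        })
    return cleaned, rejected


cleaned, rejected = clean(raw_records)
print(f"kept {len(cleaned)} of {len(raw_records)} rows, rejected {len(rejected)}")
```

Even in this toy case, fewer than half of the rows survive, which hints at why cleaning takes up so much of a project.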

The other thing that’s important for AI solution architecture is ethics, e.g. detecting biases in the data. Sometimes people look for biases way after launch, which is wrong. Take the recent launch of Google Gemini. They finally released it and people found out that its data had racist bias. The result was that Google’s share price plunged and they lost $9bn. So as a solution architect, if you don't identify data biases early, you are in trouble.
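One simple way to look for bias early is to compare outcome rates across groups in the training data before any model is trained. The Python sketch below is only an illustration: the group names, the rows and the 0.2 review threshold are all assumptions, not a standard or a method described in the interview.

```python
from collections import defaultdict

# Hypothetical labelled training rows: (group, label), where 1 = positive outcome.
rows = [
    ("group_a", 1), ("group_a", 1), ("group_a", 0), ("group_a", 1),
    ("group_b", 0), ("group_b", 0), ("group_b", 1), ("group_b", 0),
]


def positive_rates(data):
    """Return the share of positive labels for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, label in data:
        totals[group] += 1
        positives[group] += label
    return {group: positives[group] / totals[group] for group in totals}


def rate_gap(rates):
    """Largest difference in positive rate between any two groups."""
    return max(rates.values()) - min(rates.values())


rates = positive_rates(rows)
gap = rate_gap(rates)
MAX_GAP = 0.2  # assumed review threshold, not an industry standard
needs_review = gap > MAX_GAP
print(rates, gap, needs_review)
```

A large gap does not prove the data is biased, but it is a cheap signal that someone should look at the data before launch rather than after.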

What aspects of architecture development can AI automate?

AI’s beauty is that any corporation can use it to increase its profitability. For me, that’s writing code, business analysis, technical reports and AI strategy documentation. Or I use ChatGPT4 and Google Gemini to summarise an article or a concept.

So I don't use AI as a crutch. It gives me ideas, sort of like a human assistant helping me write articles, report summaries and code. But I'm the one responsible for the content, so I reread what the AI said and I update it if needed.

There's so much hype about AI that it’s hard to discern what is real and what is hype. The term itself is vague. Lately, what people mean by AI is generative AI, which is based on large language models (LLMs) and transformer models, but AI is broader. Even convolutional neural networks (CNNs), recurrent neural networks (RNNs), time series models and machine learning algorithms are AI.

When most people say ‘AI’, they mean artificial neural networks, but there are other types of AI too, like random forests, linear regression and even rule-based AI that plays chess.
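As a small illustration of AI that involves no neural network, the sketch below fits a linear regression with ordinary least squares in plain Python. The advertising spend and sales figures are hypothetical and serve only to show the technique.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: advertising spend vs. units sold.
spend = [1.0, 2.0, 3.0, 4.0]
units = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(spend, units)
predicted = slope * 5.0 + intercept  # prediction for an unseen spend of 5.0
print(slope, intercept, predicted)
```

The model "learns" two numbers from the data and uses them to predict a new case, which is the same basic idea that larger models apply at a far greater scale.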

Can you share an example where AI helped you with a project?

I'm working on many machine learning (ML) projects. One is about COVID-19 for a US insurance company. We are creating a system looking at nurses to predict which ones will experience burnout. Something like that is impossible to do without generative models and CNNs.

The problem is to have all the input data, including videos and phone conversations, translated. It’s impossible to solve that problem with traditional programming techniques. It can only be solved by AI.

Another project I am working on is with an automotive manufacturer, focusing on identifying which vehicles people are driving. We look at traffic data, because the company wants to know how many cars are sold, how people are using them, what’s the most popular car in San Francisco etc. You can look at sales, but also observe cars on streets, using data that comes from image classification.

You could hire people to count cars and do surveys. But that's not as efficient as ML. And you only need to do it once. They set up CCTV cameras in the city, so you can spot cars in real time, rather than just analyze sales numbers. Again, it’s a problem that can be solved by humans, but AI improves what humans can do.

How can AI improve predictive analytics and forecasting?

Businesses are doing it in the traditional way, for example predicting next year's sales, how many items will be sold in a supermarket etc. We use traditional databases and Structured Query Language (SQL) to get that information, so it’s not an unsolvable problem.

What AI brings to the table is more accuracy. When you forecast next year's sales, you use data coming in, say yearly data, which can be 100,000 or even a million data points. AI doesn’t just analyze regular data from a database. It takes into account social media feedback, what people say and where your product was sold last month, in real time, and analyses it in hours, not months.

So AI can offer consistency: an overview every day or every week. It does not replace data analysts, but does what I call pre-analysis, like a nurse doing triage.

First it looks at the data and gives you a suggestion. You take that idea as a starting point and let your data analysts check if that makes sense. Humans are still responsible for the final result. But where analysts previously needed five months for a forecasting report, you can now do the same analysis in a day.

With GenAI, you can also ask the system why it gave you that specific result. Or add parameters: what about California, an older cohort, COVID-19, war, anything. All those ‘what if’ scenarios, you can run efficiently with AI.

Maybe blue jeans sales will affect your company somehow. You won't spend $100,000 a month to explore that crazy idea. AI will tell you the reason why you should or shouldn't do it. Sometimes you can come across something that will be the key to your business’s success.

You will also be talking about user feedback, so how do you incorporate that into your AI solutions?

As a solutions architect, you should incorporate user feedback while you are building solutions, not after you have finished them. You can’t leave user testing and improvements until afterward.

That’s key, because GenAI is a new technology. No one knows how it behaves. Sometimes it hallucinates, taking different paths from what you asked for. Feedback helps you find out if it’s biased or not. So the feedback has to be built in before the product is launched.

You build that into your QA process and your pre-launch so that the data keeps coming to see if something makes sense, whether there’s a particular case that you didn’t think about. That helps you retrain your models and make them more accurate.

For example, when you build a program for buying and selling stocks, you don't trade immediately. If AI were that good, I would be a billionaire now! What it’s good at is gathering information for your analysts to make recommendations.

But it's feedback on a daily basis. The stock changes every day. If you built your model a month ago, the stock dynamic changes so quickly that the analysis may not be good anymore. So when you build a stock analysis program you incorporate that feedback. That’s true for all AI systems.
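The idea that a model goes stale without ongoing feedback can be sketched as a simple monitoring check: compare the model's error at launch with its error on recent feedback and flag it for retraining when the gap grows. The prices, the error metric and the 1.5 tolerance factor below are assumptions for illustration, not a recommended trading setup.

```python
def mean_abs_error(predictions, actuals):
    """Average absolute difference between predictions and observed values."""
    return sum(abs(p - a) for p, a in zip(predictions, actuals)) / len(actuals)


def needs_retraining(baseline_error, recent_error, tolerance=1.5):
    """Flag the model when recent error exceeds the baseline by the given factor."""
    return recent_error > baseline_error * tolerance


# Hypothetical figures: predicted vs. observed prices at launch and this week.
launch_error = mean_abs_error([100, 102, 98], [101, 101, 99])
recent_error = mean_abs_error([100, 102, 98], [104, 97, 103])

stale = needs_retraining(launch_error, recent_error)
print(launch_error, recent_error, stale)
```

Run daily or weekly against fresh feedback, a check like this turns "the data keeps coming" into a concrete signal for when to retrain.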

You will also cover proprietary and open-source tools, so what are the criteria for making the right choice?

You have to pay to use private tools. Gemini, for example, belongs to Google, so you pay a licence fee. The cost adds up a lot when you have a system that does a million queries a day.
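The way per-query fees add up is simple arithmetic, sketched below in Python. The price of $0.002 per query is a made-up figure for illustration and not any vendor's actual rate; the 30-day month is also an assumption.

```python
def monthly_cost(queries_per_day, price_per_query, days=30):
    """Estimated monthly spend on a pay-per-query model API."""
    return queries_per_day * price_per_query * days


PRICE_PER_QUERY = 0.002  # hypothetical price in US dollars, not a real vendor rate

small = monthly_cost(100, PRICE_PER_QUERY)
large = monthly_cost(1_000_000, PRICE_PER_QUERY)

print(f"100 queries/day: ${small:,.2f} per month")
print(f"1,000,000 queries/day: ${large:,.2f} per month")
```

Under these assumed numbers the same price that costs a few dollars a month at low volume comes to tens of thousands of dollars a month at a million queries a day.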

Say that you are creating a recipe online and you search for recipe-related data. When you pay for a few sets of queries it doesn't matter, but if it’s a million requests it does. An open source system is a free pre-trained model, but not exactly the same. It's a lower quality model, and not as accurate as proprietary ones.

The downside of private tools is that your data is not secure. So in the healthcare industry or finance the data cannot leave your VPN. When you look for the best treatment for a patient, all that private data can’t leave the system. You can re-licence the system from OpenAI, GPT-3.5-Turbo or GPT-4, to put it into your servers, but it's very expensive.

You could use the pre-trained model, but the problem with training open source models is that you need a large data team. It's not like a framework for building websites. Most companies do not use open source, because that’s too complex. They cannot spend tens of millions on an AI project, knowing that the model is not that good.

So there is a trade-off between using models on Hugging Face and Kaggle. Say you need data to train skin cancer models. You can use Kaggle to do it or you can purchase something. If a company uses open source, they are open to liability. If something goes wrong, it's your fault.

If you use high quality data bought from another company, it’s a guarantee that they can’t sue you for problems in the data. Sometimes open source is better, because it allows you to own the intellectual property, you own the IP. If you use private tools, you don't have that, but you have legal protection for some complex AI systems.

If you have certain data handy (e.g. number of users, growth forecast), can AI tell you which database management system (e.g. MySQL) is best to use?

Your IT staff can use AI to ask these questions. You have all your data and user feedback, so you ask what data storage system is better, e.g. Google Datastore. It will be able to give you an answer for each system and how it will affect your data.

But don’t expect AI to have all the answers. You give it all the data to do research and give you a summary and then you use your own experience to make a decision. And you can ask questions, e.g. what if my data has a high throughput rate? What API architecture do you recommend?

Or if it’s not high throughput but it's very highly relational, you can ask how to change the relations between your tables to get better throughput and, basically, lower cost.

In which fields do you think generative AI will be more disruptive?

GenAI is built on everything on the internet, from poetry to videos. But you have to look at it as an equalizer that closes a gap. A programming student can be good at Python programming when they are working with a GenAI co-pilot, but a more experienced one will go one step further. So it’s closing that gap between professionals and beginners.

That’s true for every field. For example, if you are a new salesman and you don't know when to use Salesforce or contact someone, GenAI could help you with that. Experts use GenAI to become better.

When I write code I use GenAI. I ask my team members to use GenAI to write at least 50% of code, so that they get comfortable with it. They shouldn’t spend two days searching on Stack Overflow or doing binary search. AI can do it for you. Focus on making it better. Every programmer hates to write unit tests, so don’t spend a week on that, use GenAI to write and run unit tests.

How do you stay ahead of the curve in terms of integrating the latest AI trends into business solutions?

I do that by producing new content. Last year I wrote a book on data augmentation. The process of writing made me research and develop more skills than just data analysis and augmentation.

So by creating new content, blogs, articles, you learn how to use new tools. Writers should be using ChatGPT to create something better than what they can write themselves.

You also have to be willing to unlearn things and familiarise yourself with new concepts. And it's hard for me to ‘unknow’ what I have learned. When you have a career and you think you can do things better because you have been doing this for a long time, it’s tough.

Your last class is about ‘continuous learning and adaptation’, so what skills do you think will keep people employable 10 years from now?

The first skill is understanding AI in general, not just generative AI. As a young professional, it’s almost as important as maths and reading. You need that as a foundation that you can tap into wherever you need.

And that’s true for a young engineer or a seasoned executive who wants to stay relevant. You don’t have to be an AI programmer. But it’s important to know enough to understand AI.

Say you are an executive and you want to start using AI in your company, but you don't know how to start because there's so much information about it. Mastering the basics will help you understand how an AI process should be designed. That will help you run your business.

AI is not just another trend that will fade away 5-10 years from now. This is a revolution in business that will be the norm moving forward. Companies that don't invest in AI will quickly fall behind. It’s like you are riding a bicycle and you have to compete with someone who’s on a motorcycle.