Getting Your Company and Your Cloud AI-Ready: Ebook to Rearchitect Your Infrastructure to Unlock the Potential of AI

Our partner Google Cloud created a guide for technical leaders like you, with a roadmap to build a future-proof foundation for AI innovation. With an infrastructure that can fuel the next generation of your business, new opportunities to operationalize AI will empower teams to solve legacy challenges.

In this eBook, you will discover:

  • The infrastructure considerations that can determine AI success or failure — examining cost, scalability, security, and performance dimensions
  • Actionable strategies to evaluate AI platforms, optimize resources, and maximize the value of your AI tools
  • How and when to consider adopting managed machine learning offerings like Vertex AI and flexible container environments like Google Kubernetes Engine (GKE) to ease the operational burdens of your team
  • Best practices for leveraging specialized virtual machines (VMs) optimized for AI, including those equipped with GPUs and TPUs

Ready to tap into the power of generative AI?

 

Submit your contact information to get the report:

The Executive’s Guide to Generative AI: Kickstart Your Generative AI Journey with a 10-Step Plan 

 

 

Not sure where to start with generative AI? See what your industry peers are doing and use Google Cloud’s 10-step, 30-day plan to hit the ground running with your first use case.

AI’s impact will be huge. Yet right now, only 15% of business and IT decision makers feel they have the expert knowledge needed in this fast-moving area. This comprehensive guide will not only bring you up to speed, but also help you chart a clear path forward for adopting generative AI in your business. In it, you’ll find:

  • A quick primer on generative AI.
  • A 30-day step-by-step guide to getting started.
  • KPIs to measure generative AI’s impact.
  • Industry-specific use cases and customer stories from Deutsche Bank, TIME, and more.

Dive in today to discover how generative AI can help deliver new value in your business.

 

Submit your contact information to get the report:

Get Your Copy of the Google Cloud 2024 Data and AI Trends Report

 

 

Your company is ready for generative AI. But is your data? In the AI-powered era, many organizations are scrambling to keep pace with the changes rippling across the entire data stack.

This new report from Google Cloud shares the findings from a recent survey of business and IT leaders about their goals and strategies for harnessing gen AI — and what it means for their data.

Get your copy to explore these five trends emerging from the survey:

  • Gen AI will speed the delivery of insights across organizations
  • The roles of data and AI will blur
  • Data governance weaknesses will be exposed
  • Operational data will unlock gen AI potential for enterprise apps
  • 2024 will be the year of rapid data platform modernization


Submit your contact information below to get the report:

Harnessing AI Power: Building the Next Generation Foundation

 

Author: Antti Pohjolainen, Codento

Artificial Intelligence (AI), the field that imbues machines with the power to ‘think’, is no longer solely the domain of science fiction. AI and its associated technologies are revolutionizing the way businesses operate, interact with customers, and ultimately shape the future. AI must sit at the core of any organization that wishes to be truly future-proof and embrace sustainable growth.

Yet, building the infrastructure to handle AI-driven projects can be a significant challenge for those organizations not born ‘digital natives’. Here we’ll outline some strategic pathways towards an integrated AI future that scales your business success.

 

Beyond Hype: Real-World Benefits of an AI Foundation

AI sceptics abound, perhaps wary of outlandish promises and Silicon Valley hyperbole. Let’s cut through the noise and look at some solid reasons to build a future upon a NextGen AI Foundation:

  • Efficiency reimagined: Automation remains a prime benefit of AI systems. Think about repetitive manual tasks – they can often be handled more quickly and accurately by intelligent algorithms. That frees up your precious human resources to focus on strategic initiatives and complex problem-solving that truly drive the business forward.
  • Data-driven decisions: We all have masses of data – often, organizations literally don’t know what to do with it all. AI is the key to transforming data into actionable insights. Make faster, better-informed choices from product development to resource allocation.
  • Predictive powers: Anticipate customer needs, optimize inventory, forecast sales trends – AI gives businesses a valuable window into the future and the chance to act with precision. It mitigates risks and maximizes opportunities.

Take our customer BHG as an example. They needed to implement a solid BI platform to serve the whole company now and in the future. With the help of Codento’s data experts, BHG now has a highly automated, robust financial platform in production. Read more here.

 

Constructing Your AI Foundation: Key Considerations

Ready to join the AI-empowered leagues? It’s critical to start with strong groundwork:

  • Cloud is King: Cloud-based platforms provide the flexibility, scalability, and computing power that ambitious AI projects demand. Look for platforms with specialized AI services to streamline development and reduce overhead.
  • Data is The Fuel: Your AI systems are only as good as the data they’re trained on. Make sure you have robust data collection, cleansing, and governance measures in place. Remember, high-quality data yields greater algorithmic accuracy.
  • The Human Touch: Don’t let AI fears take hold. This isn’t about replacing humans but supplementing them. Re-skill, re-align, and redeploy your teams to work with AI tools. AI’s success relies on collaboration, and ethical AI development should be your mantra.
  • Start Small, Aim Big: Begin with focused proof-of-concept projects to demonstrate value before expanding your AI commitment. A well-orchestrated, incremental approach can help manage complexity and gain acceptance throughout your organization.

 

The Road Ahead: AI’s Power to Transform

It’s undeniable that building a Next Generation Foundation with AI requires effort and careful planning. But the potential for businesses of all sizes is breathtaking. Imagine streamlined operations, enhanced customer experiences, and insights that lead to unprecedented successes.

AI isn’t just the future – it’s the foundation for the businesses that will be thriving in the future. The time to join the AI revolution is now. The rewards are simply too great to be left on the table.

 

About the author: Antti “Apo” Pohjolainen, Vice President, Sales, joined Codento in 2020. Antti has led Innofactor’s (a Nordic Microsoft IT provider) sales organization in Finland and, prior to that, worked in leadership roles at Microsoft for the public sector in Finland and Central & Eastern Europe. Apo has been working in different sales roles longer than he can remember. He gets a “seller’s high” when meeting with customers and finding solutions that provide value for all parties involved. Apo received his MBA from the University of Northampton; his final business research study dealt with multi-cloud. Apo has frequently lectured on AI in business at the Haaga-Helia University of Applied Sciences.

 

Follow us and subscribe to our AI.cast to keep yourself up to date on recent AI developments:

What Does the CEO of an AI-driven Software Consulting Firm Actually Do During a Workday?

 

Author: Anthony Gyursanszky, CEO, Codento

This is a question that comes up from time to time. When you have a competent team around you, the answer is simple: I consult myself, meet existing clients, or sell our consulting services to new clients. Looking back at the past year, my own statistics indicate that my personal consulting has been somewhat limited this time, and more time has been spent with new clients.

 

And How about My Calendar?

My calendar shows, among other things, 130 one-on-one discussions with clients, especially focusing on the utilization of artificial intelligence across various industries and with leaders and experts from diverse backgrounds. Out of these, 40 discussions led to scheduling in-depth AI workshops on our calendars. I’ve already conducted 25 of these workshops with our consultants, and almost every client has requested concrete proposals from us for implementing the most useful use cases. Several highly intriguing actual implementation projects have already been initiated.

The numbers from my colleagues seem quite similar, and collectively, through these workshops, we have identified nearly 300 high-value AI use cases with our clients. This indicates that there will likely be a lot of hustle in the upcoming year as well.

 

What Are My Observations?

In leveraging artificial intelligence, there’s a clear shift in the Nordics from hesitation and cautious contemplation to actual business-oriented plans and actions. Previously, AI solutions developed almost exclusively for product development have now been accompanied by customer-specific implementations sought by business functions, aiming for significant competitive advantages in specific business areas.

 

My Favorite Questions

What about the next year? My favorite questions:

  1. Have you analyzed the right areas to invest in for leveraging AI in terms of your competitiveness?
  2. If your AI strategy = ChatGPT, what kind of analysis is it based on?
  3. Assuming that the development of AI technologies will accelerate further and the options will increase, is now the right time to make a strict technology/supplier choice?
  4. If your business data isn’t yet ready for leveraging AI, how long should you still allow your competitors to have an edge?

What would be your own answers?

 

About the author:

Anthony Gyursanszky, CEO, joined Codento in late 2019 with more than 30 years of experience in the IT and software industry. Anthony has previously held management positions at F-Secure, SSH, Knowit / Endero, Microsoft Finland, Tellabs, Innofactor, and Elisa. He has also served on the boards of software companies, including Arc Technology and Creanord. Anthony also works as a senior consultant for Value Mapping Services. His experience covers business management, product management, product development, software business, SaaS business, process management, and software development outsourcing. Anthony is also a certified Cloud Digital Leader.

 

Want to Learn More?

Register for the free AI.cast to:

  • Get immediate access to all earlier episodes
  • Automatically receive access to each new episode as soon as it is published

 

 

Introduction to AI in Business Blog Series: Unveiling the Future

Author: Antti Pohjolainen, Codento

 

Foreword

In today’s dynamic business landscape, the integration of Artificial Intelligence (AI) has emerged as a transformative force, reshaping the way industries operate and paving the way for innovation. Companies of all sizes are implementing AI-based solutions.

AI is not just a technological leap; it’s a strategic asset, revolutionizing how businesses function, make decisions, and serve their customers.

In discussions and workshops with our customers, we have identified close to 250 different use cases for a wide range of industries. 

 

Our AI in Business Blog Series

In addition to publishing our AI.cast on-demand video production, we summarize our key learnings and insights in the “AI in Business” blog series.

This blog series will delve into the multifaceted role AI plays in reshaping business operations, customer relations, and overall software intelligence. In the following blog posts, each post has a specific viewpoint concentrating on a business need. Each perspective contains examples and customer references of innovative ways to implement AI.

In the next part – Customer Foresight – we’ll discuss how AI will provide businesses with better customer understanding based on their buying behavior, better use of various customer data, and analyzing customer feedback.

In part three – Smart Operations – we’ll look at examples of benefits customers have gained by implementing AI into their operations, including smart scheduling and supply chain optimization.

In part four – Software Intelligence – we’ll concentrate on using AI in software development.

Implementing AI to solve your business needs could provide better decision-making capabilities, increase operational efficiency, improve customer experiences, and help mitigate risks.

The potential of AI in business is vast, and these blog posts aim to illuminate the path toward leveraging AI for enhanced business growth, efficiency, and customer satisfaction. Join us in unlocking the true potential of AI in the business world.

Stay tuned for our next installment: “Customer Foresight” – Unveiling the Power of Predictive Analytics in Understanding Customer Behavior!

 

 

About the author: Antti “Apo” Pohjolainen, Vice President, Sales, joined Codento in 2020. Antti has led Innofactor’s (a Nordic Microsoft IT provider) sales organization in Finland and, prior to that, worked in leadership roles at Microsoft for the public sector in Finland and Central & Eastern Europe. Apo has been working in different sales roles longer than he can remember. He gets a “seller’s high” when meeting with customers and finding solutions that provide value for all parties involved. Apo received his MBA from the University of Northampton; his final business research study dealt with multi-cloud. Apo has frequently lectured on AI in business at the Haaga-Helia University of Applied Sciences.

 

 

Follow us and subscribe to our AI.cast to keep yourself up to date on recent AI developments:

Google Cloud Nordic Summit 2023: Three Essential Technical Takeaways

Authors: Jari Timonen, Janne Flinck, and Google Bard

Codento participated with a team of six members in the Google Cloud Nordic Summit on 19-20 September 2023, where we had the opportunity to learn about the latest trends and developments in cloud computing.

In this blog post, we will share some of the key technical takeaways from the conference, from a developer’s perspective.

 

Enterprise-class Generative AI for Large-Scale Implementation

One of the most exciting topics at the conference was Generative AI (GenAI). GenAI is a type of artificial intelligence that can create new content, such as text, code, images, and music. GenAI is still in its early stages of development, but it has the potential to revolutionize many industries.

At the conference, Google Cloud announced that its GenAI toolset is ready for larger-scale implementations. This is a significant milestone, as it means that GenAI is no longer just a research project, but a technology that can be used to solve real-world problems.

One of the key differentiators of Google Cloud’s GenAI technologies is their focus on scalability and reliability. Google Cloud has a long track record of running large-scale AI workloads, and it is bringing this expertise to the GenAI space. This makes Google Cloud a good choice for companies that are looking to implement GenAI at scale.

 

Cloud Run Helps Developers to Focus on Writing Code

Another topic that was covered extensively at the conference was Cloud Run. Cloud Run is a serverless computing platform that allows developers to run their code without having to manage servers or infrastructure. Cloud Run is a simple and cost-effective way to deploy and manage web applications, microservices, and event-driven workloads.

One of the key benefits of Cloud Run is that it is easy to use. Developers can deploy their code to Cloud Run with a single command, and Google Cloud will manage the rest. This frees up developers to focus on writing code, rather than managing infrastructure.

Google recently released Direct VPC egress functionality for Cloud Run. It lowers latency and increases throughput for connections to your VPC network, and it is more cost-effective than serverless VPC connectors, which were previously the only way to connect Cloud Run to your VPC.

Another benefit of Cloud Run is that it is cost-effective. Developers only pay for the resources that their code consumes, and there are no upfront costs or long-term commitments. This makes Cloud Run a good choice for companies of all sizes.

 

Site Reliability Engineering (SRE) Increases Customer Satisfaction

Site Reliability Engineering (SRE) is a discipline that combines software engineering and systems engineering to ensure the reliability and performance of software systems. SRE is becoming increasingly important as companies rely more and more on cloud-based applications.

At the conference, Google Cloud emphasized the importance of SRE for current and future software teams and companies. 

One of the key benefits of SRE is that it can help companies improve the reliability and performance of their software systems. This can lead to reduced downtime, improved customer satisfaction, and increased revenue.

Another benefit of SRE is that it can help companies reduce the cost of operating their software systems. SRE teams can help companies identify and eliminate waste, and they can also help companies optimize their infrastructure.

 

Conclusions

The Google Cloud Nordic Summit was a great opportunity to learn about the latest trends and developments in cloud computing. We were particularly impressed with Google Cloud’s GenAI toolset and Cloud Run platform. We believe that these technologies have the potential to revolutionize the way that software is developed and deployed.

We were also delighted that Codento was awarded the Partner Impact 2023 Recognition in Finland by the Google Cloud Nordic team. Codento received praise for its deep expertise in Google Cloud services and market impact, an impressive NPS score, and the achievement of its second Google Cloud specialization.


About the Authors

Jari Timonen, is an experienced software professional with more than 20 years of experience in the IT field. Jari’s passion is to build bridges between the business and the technical teams, where he has worked in his previous position at Cargotec, for example. At Codento, he is at his element in piloting customers towards future-compatible cloud and hybrid cloud environments.

Janne Flinck is an AI & Data Lead at Codento. Janne joined Codento from Accenture 2022 with extensive experience in Google Cloud Platform, Data Science, and Data Engineering. His interests are in creating and architecting data-intensive applications and tooling. Janne has three professional certifications and one associate certification in Google Cloud and a Master’s Degree in Economics.

Bard is a conversational generative artificial intelligence chatbot developed by Google, based initially on the LaMDA family of large language models (LLMs) and later the PaLM LLM. It was developed as a direct response to the rise of OpenAI’s ChatGPT, and was released in a limited capacity in March 2023 to lukewarm responses, before expanding to other countries in May.

 

Contact us for more information about our Google Cloud capabilities:

AI in Manufacturing: AI Visual Quality Control

 

Author: Janne Flinck

 

Introduction

Inspired by the Smart Industry event, we decided to start a series of blog posts that tackle some of the issues in manufacturing with AI. In this first post, we will talk about automating quality control with vision AI.

Manufacturing companies, as well as companies in other industries like logistics, prioritize the effectiveness and efficiency of their quality control processes. In recent years, computer vision-based automation has emerged as a highly efficient solution for reducing quality costs and defect rates. 

The American Society for Quality estimates that most manufacturers spend the equivalent of 15% to 20% of revenues on “true quality-related costs.” Some organizations go as high as 40% cost-of-quality in their operations. Cost centers that affect quality in manufacturing come in three different areas:

  • Appraisal costs: Verification of material and processes, quality audits of the entire system, supplier ratings
  • Internal failure costs: Waste of resources or errors from poor planning or organization, correction of errors on finished products, failure of analysis regarding internal procedures
  • External failure costs: Repairs and servicing of delivered products, warranty claims, complaints, returns

Artificial intelligence is helping manufacturers improve in all these areas, which is why leading enterprises have been embracing it. According to a 2021 survey of more than 1,000 manufacturing executives across seven countries interviewed by Google Cloud, 39% of manufacturers are using AI for quality inspection, while 35% are using it for quality checks on the production line itself.

Top 5 areas where AI is currently deployed in day-to-day operations:

  • Quality inspection 39%
  • Supply chain management 36%
  • Risk management 36%
  • Product and/or production line quality checks 35%
  • Inventory management 34%

Source: Google Cloud Manufacturing Report

With the assistance of vision AI, production line workers are able to reduce the amount of time spent on repetitive product inspections, allowing them to shift their attention towards more intricate tasks, such as conducting root cause analysis. 

Modern computer vision models and frameworks offer versatility and cost-effectiveness, with specialized cloud-native services for model training and edge deployment further reducing implementation complexities.

 

Solution overview

In this blog post, we focus on the challenge of defect detection on assembly and sorting lines. The real-time visual quality control solution, implemented using Google Cloud’s Vertex AI and AutoML services, can track multiple objects and evaluate the probability of defects or damages.

The first stage involves preparing the video stream by splitting the stream into frames for analysis. The next stage utilizes a model to identify bounding boxes around objects.

Once the object is identified, the defect detection system processes the frame by cutting out the object using the bounding box, resizing it, and sending it to a defect detection model for classification. The output is a frame where the object is detected with bounding boxes and classified as either a defect or not a defect. The quick processing time enables real-time monitoring using the model’s output, automating the defect detection process and enhancing overall efficiency.
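As an illustration of that crop step, here is a minimal sketch in plain Python. The `crop_normalized_box` helper and the list-of-lists frame are illustrative stand-ins for real image arrays and real detection-API output, not the pipeline’s actual code:

```python
def crop_normalized_box(frame, box):
    """Cut an object out of a frame using a normalized bounding box.

    frame: 2-D list of pixel values (rows of columns).
    box:   (x_min, y_min, x_max, y_max) with coordinates in [0, 1],
           as typical object-detection APIs return them.
    """
    height, width = len(frame), len(frame[0])
    x0 = int(box[0] * width)
    y0 = int(box[1] * height)
    x1 = int(box[2] * width)
    y1 = int(box[3] * height)
    return [row[x0:x1] for row in frame[y0:y1]]

# A 4x4 "frame" with a bright object in the lower-right quadrant.
frame = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
crop = crop_normalized_box(frame, (0.5, 0.5, 1.0, 1.0))
# crop is [[9, 9], [9, 9]]
```

In the real system this crop would then be resized and sent to the defect classification model.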

The core solution architecture on Google Cloud is as follows:

Implementation details

In this section, I will touch upon some parts of the system: mainly what it takes to get started and what to consider. The dataset is self-created from objects I found at home, but the same approach and algorithm can be used on any objects as long as the video quality is good.

Here is an example frame from the video, where we can see one defective object and three non-defective objects: 

We can also see that one of the objects is leaving the frame on the right side and another one is entering the frame from the left. 

The video can be found here.

 

Datasets and models overview

In our experiment, we used a video that simulates a conveyor belt scenario. The video showed objects moving from the left side of the screen to the right, some of which were defective or damaged. Our training dataset consists of approximately 20 different objects, with four of them being defective.

For visual quality control, we need to utilize an object detection model and an image classification model. There are three options to build the object detection model:

  1. Train a model powered by Google Vertex AI AutoML
  2. Use the prebuilt Google Cloud Vision API
  3. Train a custom model

For this prototype we decided to opt for both options 1 and 2. To train a Vertex AI AutoML model, we need an annotated dataset with bounding box coordinates. Due to the relatively small size of our dataset, we chose to use Google Cloud’s data annotation tool. However, for larger datasets, we recommend using Vertex AI data labeling jobs.

For this task, we manually drew bounding boxes for each object in the frames and annotated the objects. In total, we used 50 frames for training our object detection model, which is a very modest amount.

Machine learning models usually require a larger number of samples for training. However, for the purpose of this blog post, the quantity of samples was sufficient to evaluate the suitability of the cloud service for defect detection. In general, the more labeled data you can bring to the training process, the better your model will be. Another obvious critical requirement for the dataset is to have representative examples of both defects and regular instances.

The subsequent stages in creating the AutoML object detection and AutoML defect detection datasets involved partitioning the data into training, validation, and test subsets. By default, Vertex AI automatically distributes 80% of the images for training, 10% for validation, and 10% for testing. We used manual splitting to avoid data leakage; specifically, we avoided placing sets of sequential frames in different subsets.
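A minimal sketch of such a leakage-aware split, assuming frames are identified by sequential indices; the helper name, chunk size, and fractions are illustrative:

```python
import random

def split_by_sequence(frame_ids, chunk_size=10, seed=42,
                      fractions=(0.8, 0.1, 0.1)):
    """Split frames into train/validation/test without leaking
    near-duplicate sequential frames across subsets.

    Frames are grouped into contiguous chunks first, and whole chunks
    are assigned to one subset, so two neighbouring frames never end
    up in different splits.
    """
    chunks = [frame_ids[i:i + chunk_size]
              for i in range(0, len(frame_ids), chunk_size)]
    random.Random(seed).shuffle(chunks)
    n_train = round(len(chunks) * fractions[0])
    n_val = round(len(chunks) * fractions[1])
    train = [f for c in chunks[:n_train] for f in c]
    val = [f for c in chunks[n_train:n_train + n_val] for f in c]
    test = [f for c in chunks[n_train + n_val:] for f in c]
    return train, val, test

frames = list(range(50))  # 50 annotated frames, as in the post
train, val, test = split_by_sequence(frames, chunk_size=5)
```

With 50 frames in chunks of 5, this yields 40/5/5 frames per subset, and no chunk of neighbouring frames is ever divided between subsets.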

The process for creating the AutoML dataset and model is as follows:

As for using the out-of-the-box Google Cloud Vision API for object detection, there is no dataset annotation requirement. One just uses the client libraries to call the API and process the response, which consists of normalized bounding boxes and object names. From these object names we then filter for the ones that we are looking for. The process for Vision API is as follows:

Why would one train a custom model if using Google Cloud Vision API is this simple? For starters, the Vision API will detect generic objects, so if there is something very specific, it might not be in the labels list. Unfortunately, it looks like the complete list of labels detected by Google Cloud Vision API is not publicly available. One should try the Google Cloud Vision API and see if it is able to detect the objects of interest.
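The filtering step described above can be sketched as follows. The dictionaries stand in for the annotation objects a detection API returns; the field names, labels, and threshold here are illustrative, not the exact Vision API response schema:

```python
# Each annotation mimics one entry of an object-localization
# response: a label, a confidence score, and a normalized bounding
# box (field names here are illustrative).
annotations = [
    {"name": "Packaged goods", "score": 0.91,
     "box": (0.10, 0.20, 0.35, 0.60)},
    {"name": "Person", "score": 0.88,
     "box": (0.70, 0.10, 0.95, 0.90)},
    {"name": "Packaged goods", "score": 0.42,
     "box": (0.55, 0.25, 0.80, 0.65)},
]

OBJECTS_OF_INTEREST = {"Packaged goods"}
MIN_SCORE = 0.5

# Keep only confident detections of the labels we care about.
detections = [
    a for a in annotations
    if a["name"] in OBJECTS_OF_INTEREST and a["score"] >= MIN_SCORE
]
# One detection survives: the high-confidence "Packaged goods" box.
```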

According to Vertex AI’s documentation, AutoML models perform optimally when the label with the lowest number of examples has at least 10% of the examples as the label with the highest number of examples. In a production case, it is important to capture roughly similar amounts of training examples for each category.

Even if you have an abundance of data for one label, it is best to have an equal distribution for each label. As our primary aim was to construct a prototype using a limited dataset, rather than enhancing model accuracy, we did not tackle the problem of imbalanced classes. 
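That 10% rule of thumb is easy to check programmatically before training; a minimal sketch, where the helper name is ours and the counts roughly mirror our small dataset:

```python
from collections import Counter

def check_label_balance(labels, min_ratio=0.10):
    """Check the Vertex AI rule of thumb: the rarest label should
    have at least `min_ratio` as many examples as the most common
    label."""
    counts = Counter(labels)
    rarest = min(counts.values())
    most_common = max(counts.values())
    return rarest / most_common >= min_ratio

# 46 non-defective vs 4 defective examples, roughly as in the post:
labels = ["ok"] * 46 + ["defect"] * 4
balanced = check_label_balance(labels)
# 4 / 46 is about 0.087 < 0.10, so this dataset fails the guideline.
```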

 

Object tracking

We developed an object tracking algorithm, based on the OpenCV library, to address the specific challenges of our video scenario. The specific trackers we tested were CSRT, KCF and MOSSE. The following rules of thumb apply in our scenario as well:

  • Use CSRT when you need higher object tracking accuracy and can tolerate slower FPS throughput
  • Use KCF when you need faster FPS throughput but can handle slightly lower object tracking accuracy
  • Use MOSSE when you need pure speed
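Those rules of thumb can be encoded as a small selector; this is a sketch with an illustrative function name, and in practice the returned name would map to one of OpenCV’s tracker factories:

```python
def pick_tracker(need_accuracy, need_speed):
    """Encode the tracker rules of thumb as a simple selector.
    Returns the tracker family name to instantiate via OpenCV."""
    if need_accuracy and not need_speed:
        return "CSRT"   # highest accuracy, slowest FPS throughput
    if need_speed and not need_accuracy:
        return "MOSSE"  # pure speed
    return "KCF"        # balanced middle ground

choice = pick_tracker(need_accuracy=True, need_speed=False)
# choice == "CSRT"
```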

For object tracking we need to take into account the following characteristics of the video:

  • Each frame may contain one or multiple objects, or none at all
  • New objects may appear during the video and old objects disappear
  • Objects may only be partially visible when they enter or exit the frame
  • There may be overlapping bounding boxes for the same object
  • The same object will be in the video for multiple successive frames

To speed up the entire process, we only send each fully visible object to the defect detection model twice. We then average the probability output of the model and assign the label to that object permanently. This way we can save both computation time and money by not calling the model endpoint needlessly for the same object multiple times throughout the video.
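A minimal sketch of that classify-twice-then-cache idea, with a stand-in classifier in place of the real model endpoint (class and parameter names are illustrative):

```python
class DefectLabelCache:
    """Classify each tracked object only twice, average the defect
    probabilities, then pin the label for the rest of the video so
    the model endpoint is not called needlessly."""

    def __init__(self, classify, threshold=0.5, max_calls=2):
        self.classify = classify   # function: crop -> defect probability
        self.threshold = threshold
        self.max_calls = max_calls
        self.scores = {}           # object_id -> probabilities so far
        self.labels = {}           # object_id -> final label

    def label(self, object_id, crop):
        if object_id in self.labels:
            return self.labels[object_id]       # cached, no model call
        probs = self.scores.setdefault(object_id, [])
        probs.append(self.classify(crop))
        if len(probs) >= self.max_calls:
            mean = sum(probs) / len(probs)
            verdict = "defect" if mean >= self.threshold else "ok"
            self.labels[object_id] = verdict
            return verdict
        return None                              # not yet decided

# A stand-in classifier: pretends crops with value 9 look defective.
cache = DefectLabelCache(lambda crop: 0.9 if crop == 9 else 0.1)
first = cache.label("obj-1", 9)    # first look: undecided (None)
second = cache.label("obj-1", 9)   # second look: averaged -> "defect"
third = cache.label("obj-1", 9)    # cached, classifier not called again
```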

 

Conclusion

Here is the result output video stream and an extracted frame from the quality control process. Blue means that the object has been detected but has not yet been classified because the object is not fully visible in the frame. Green means no defect detected and red is a defect:

The video can be found here.

These findings demonstrate that it is possible to develop an automated visual quality control pipeline with a minimal number of samples. In a real-world scenario, we would have access to much longer video streams and the ability to iteratively expand the dataset to enhance the model until it meets the desired quality standards.

Despite these limitations, thanks to Vertex AI, we were able to achieve reasonable quality in just the first training run, which took only a few hours, even with a small dataset. This highlights the efficiency and effectiveness of our approach of utilizing pretrained models and AutoML solutions, as we were able to achieve promising results in a very short time frame.


About the author: Janne Flinck is an AI & Data Lead at Codento. Janne joined Codento from Accenture 2022 with extensive experience in Google Cloud Platform, Data Science, and Data Engineering. His interests are in creating and architecting data-intensive applications and tooling. Janne has three professional certifications in Google Cloud and a Master’s Degree in Economics.

 

 

Please contact us for more information on how to utilize artificial intelligence in industrial solutions.

 

Video Blog: Demonstrating Customer Lifetime Value

 

Contact us for more information:

 

Customer Lifetime Value Modeling as a Win-Win for Both the Vendor and the Customer

 

Author: Janne Flinck, Codento

Introduction to Customer Lifetime Value

Customer analytics is not about squeezing out every penny from a customer, nor should it be about short-term thinking and actions. Customer analytics should seek to maximize the full value of every customer relationship. This metric of “full value” is called the lifetime value (LTV) of a customer. 

Obviously a business should look at how valuable customers have been in the past, but purely extrapolating that value into the future might not be the most accurate metric.

The more valuable a customer is likely to be to a business, the more that business should invest in that relationship. One should think about customer lifetime value as a win-win situation for the business and the customer. The higher a customer’s LTV is to your business, the more likely your business should be to address their needs.

The so-called Pareto principle is often invoked here: 20% of your customers represent 80% of your sales. What if you could identify these customers, not just in the past but in the future as well? Predicting LTV is a way of identifying those customers in a data-centric manner.
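As a quick illustration of checking the Pareto principle on historical data (the helper name and the sales figures are invented for the example):

```python
def top_share(sales, top_fraction=0.20):
    """Fraction of total sales generated by the top `top_fraction`
    of customers, ranked by their sales."""
    ranked = sorted(sales, reverse=True)
    k = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:k]) / sum(ranked)

# Ten customers; the two biggest accounts dominate revenue.
sales = [400, 380, 50, 40, 30, 30, 25, 20, 15, 10]
share = top_share(sales)
# share == 0.78: the top 20% of customers bring 78% of sales here.
```

The predictive step is then to estimate each customer’s *future* value so this ranking reflects where the business is heading, not just where it has been.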

 

Business Strategy and LTV

There are some more or less “standard” ways of calculating LTV that I will touch upon a little later in this article. These out-of-the-box calculation methods can be good, but more importantly, they provide good examples to start with.

What I mean by this is that a business leader will have to consider and weigh in on which factors are included in the LTV calculation. LTV should set the direction for your business; it is also about business strategy, meaning it will not be the same for every business and may even change over time for the same business.

If your business strategy is about sustainability, then the LTV should include factors that measure it. Perhaps a customer has more strategic value to your business if they buy the more sustainable version of your product. This is not a set-and-forget metric either; it should be revisited over time to check that it still reflects your business strategy and goals.

The LTV is also important because other major metrics and decision thresholds can be derived from it. For example, the LTV is naturally an upper limit on the spending to acquire a customer, and the sum of the LTVs for all of the customers of a brand, known as the customer equity, is a major metric for business valuations.

 

Methods of Calculating LTV

At their core, LTV models can be used to answer these types of questions about customers:

  • How many transactions will the customer make in a given future time window?
  • How much value will the customer generate in a given future time window?
  • Is the customer in danger of becoming permanently inactive?

When you are predicting LTV, there are two distinct problems which require different data and modeling strategies:

  • Predict the future value for existing customers
  • Predict the future value for new customers

Many companies predict LTV only by looking at the total monetary amount of sales, without using context. For example, a customer who makes one big order might be less valuable than another customer who buys multiple times, but in smaller amounts.

LTV modeling can help you better understand the buying profile of your customers and value your business more accurately. By modeling LTV, an organization can prioritize its actions by:

  • Deciding how much to invest in advertising
  • Deciding which customers to target with advertising
  • Planning how to move customers from one segment to another
  • Planning pricing strategies
  • Deciding which customers to dedicate more resources to

LTV models are used to quantify the value of a customer and estimate the impact of actions that a business might take. Let us take a look at two example scenarios for LTV calculation.

Non-contractual businesses and contractual businesses are two common ways of approaching LTV for two different types of businesses or products. Other types include multi-tier products, cross-selling of products or ad-supported products among others.

 

Non-contractual Business

One of the most basic ways of calculating LTV is to look at your historical purchases and customer interactions and calculate the number of transactions per customer and the average value of a transaction.

Then, using the available data, you build a model that can estimate each customer’s probability of purchase in a future time window. Once you have the following three metrics, you get the LTV by multiplying them:

LTV = Number of transactions x Value of transactions x Probability of purchase

There are some gotchas in this way of modeling the problem. First of all, as discussed earlier, what is value? Is it revenue, profit, or quantity sold? Does a certain feature of a product increase the value of a transaction?

The value should be something that adheres to your business strategy and discourages short-term profit seeking and instead fosters long-term customer relationships.

Second, as mentioned earlier, predicting LTV for new customers will require different methods as they do not have a historical record of transactions.
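As a minimal sketch (not Codento’s actual implementation), the non-contractual formula above might be computed like this; the transaction records and per-customer purchase probabilities are hypothetical inputs, with the probabilities assumed to come from some upstream propensity model:

```python
# A sketch of: LTV = number of transactions x average transaction value x P(purchase).
# Transaction records and purchase probabilities here are illustrative assumptions.
from collections import defaultdict

def naive_ltv(transactions, purchase_prob):
    """transactions: iterable of (customer_id, amount); purchase_prob: {customer_id: p}."""
    counts = defaultdict(int)
    totals = defaultdict(float)
    for customer_id, amount in transactions:
        counts[customer_id] += 1
        totals[customer_id] += amount
    ltv = {}
    for customer_id, n in counts.items():
        avg_value = totals[customer_id] / n
        ltv[customer_id] = n * avg_value * purchase_prob.get(customer_id, 0.0)
    return ltv

transactions = [("a", 100.0), ("a", 60.0), ("b", 300.0)]
purchase_prob = {"a": 0.5, "b": 0.25}  # would come from a separate propensity model
print(naive_ltv(transactions, purchase_prob))  # → {'a': 80.0, 'b': 75.0}
```

In practice the probability of purchase would be estimated by a proper model (for example a BG/NBD-style model) rather than supplied as a fixed dictionary.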

 

Contractual Business

For a contractual business with a subscription model, the LTV calculation is different, as the customer is committed to buying from you for the duration of the contract. You can also directly observe churn, since customers who churn simply do not re-subscribe. Examples include a magazine with a monthly subscription or a streaming service.

For such products, one can calculate the LTV from the expected number of months for which the customer will keep re-subscribing.

LTV = Survival rate x Value of subscription x Discount rate

The survival rate by month would be the proportion of customers that maintain their subscription. This can be estimated from the data by customer segment using, for example, survival analysis. The value of a subscription could be revenue minus cost of providing the service and minus customer acquisition cost.

Again, your business has to decide what counts as value. The discount rate is included because the subscription revenue arrives in the future.
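Read as a discounted sum over future months, the contractual formula can be sketched as follows; the constant monthly survival rate, subscription margin, and discount rate are hypothetical inputs you would estimate from your own data:

```python
# Contractual LTV as a discounted sum of expected future subscription margin.
# monthly_value, survival_rate, and monthly_discount_rate are illustrative inputs.
def subscription_ltv(monthly_value, survival_rate, monthly_discount_rate, horizon_months=120):
    """Sum over months t of survival_rate**t * monthly_value / (1 + discount)**t."""
    total = 0.0
    for t in range(1, horizon_months + 1):
        total += (survival_rate ** t) * monthly_value / ((1.0 + monthly_discount_rate) ** t)
    return total

# E.g. a 10 EUR monthly margin, 95% monthly retention, 0.5% monthly discount rate:
print(round(subscription_ltv(10.0, 0.95, 0.005), 2))
```

A handy sanity check: with no churn and no discounting, the result is simply the monthly value times the horizon.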

 

Actions and Measures

So you now have an LTV metric that decision makers in your organization are happy with. Now what? Do you just slap it on a dashboard? Do you recalculate the metric once a month and show the evolution of this metric on a dashboard?

Is LTV just another metric that the data analysis team provides to stakeholders, expecting them to somehow use it to “drive business results”? Those are fine ideas, but they don’t drive action by themselves.

The LTV metric can be used in multiple ways. For example, in marketing one can design treatments by segment and run experiments to see which treatments maximize LTV instead of short-term profit.

Multiplying the probability that a customer responds favorably to a designed treatment by their LTV gives the expected reward. That reward minus the treatment cost is the expected business value. Thus, one gets the expected business value of each treatment and can choose the one with the best effect for each customer or customer segment.

Doing this calculation for the entire customer base gives a list of customers for whom to provide a specific treatment that maximizes LTV given the marketing budget. LTV can also be used to move customers from one segment to another.
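The treatment-selection logic described above can be sketched as follows; the treatment names, response probabilities, and costs are purely illustrative:

```python
# Expected business value of a treatment = P(favorable response) x LTV - cost.
# Pick the treatment with the highest expected value for a given customer.
def best_treatment(ltv, treatments):
    """treatments: {name: (response_probability, cost)} -> (best_name, expected_value)."""
    expected = {name: prob * ltv - cost for name, (prob, cost) in treatments.items()}
    best = max(expected, key=expected.get)
    return best, expected[best]

treatments = {
    "discount_email": (0.125, 1.0),  # cheap, modest response rate
    "personal_call": (0.25, 20.0),   # expensive, stronger response rate
}
print(best_treatment(200.0, treatments))  # → ('personal_call', 30.0)
```

Running this over the whole customer base, subject to the marketing budget, yields the kind of per-customer treatment list the text describes.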

For pricing, one could estimate how different segments of customers react to different pricing strategies and use price to affect the LTV trajectory of their customer base towards a more optimal LTV. For example, if using dynamic pricing algorithms, the LTV can be taken into account in the reward function.

Internal teams should track KPIs that will have an effect on the LTV calculation over which they have control. For example, in a non-contractual context, the product team can be measured on how well they increase the average number of transactions, or in a contractual context, the number of months that a typical customer stays subscribed.

The support team can be measured on the way that they provide customer service to reduce customer churn. The product development team can be measured on how well they increase the value per transaction by reducing costs or by adding features. The marketing team can be measured on the effectiveness of treatments to customer segments to increase the probability of purchase. 

After all, you get what you measure.

 

A Word on Data

LTV models generally aim to predict customer behavior as a function of observed customer features. This means that it is important to collect data about interactions, treatments and behaviors. 

Purchasing behavior is driven by fundamental factors such as the valuation of a product or service compared with competing products or services. These factors may or may not be directly measurable, but gathering information about competitor prices and actions can be crucial when analyzing customer behavior.

Other important data is created by the interaction between a customer and a brand. These properties characterize the overall customer experience, including customer satisfaction and loyalty scores.

The most important category of data is observed behavioral data. This can be in the form of purchase events, website visits, browsing history, and email clicks. This data often captures interactions with individual products or campaigns at specific points in time. From purchases one can quantify metrics like frequency or recency of purchases. 

Behavioral data carry the most important signals needed for modeling, as customer behavior is at the core of our modeling practice for predicting LTV.
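As an illustration of turning such behavioral data into model features, recency, frequency, and monetary value might be derived from raw purchase events along these lines; the event fields (customer_id, day_number, amount) are assumptions for the sketch:

```python
# Derive simple recency/frequency/monetary (RFM) features per customer
# from hypothetical purchase events of the form (customer_id, day_number, amount).
from collections import defaultdict

def rfm_features(events, today):
    last_day = defaultdict(int)
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, day, amount in events:
        last_day[customer_id] = max(last_day[customer_id], day)
        frequency[customer_id] += 1
        monetary[customer_id] += amount
    return {
        cid: {
            "recency_days": today - last_day[cid],  # days since last purchase
            "frequency": frequency[cid],            # number of purchases
            "monetary": monetary[cid],              # total spend
        }
        for cid in frequency
    }

events = [("a", 10, 50.0), ("a", 40, 25.0), ("b", 5, 300.0)]
print(rfm_features(events, today=45))
```

Features like these would then be joined with catalog, price, and seasonality data before training an LTV model.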

The data described above should also be augmented with additional features from your business’s side of the equation, such as catalog data, seasonality, prices, discounts, and store-specific information.

 

Prerequisites for Implementing LTV

Thus far in this article we have discussed why LTV is important, shown some examples of how to calculate it, and briefly discussed how to make it actionable. Here are some questions that need to be answered before implementing an LTV calculation method:

  • Do we know who our customers are?
  • What is the best measure of value?
  • How do we incorporate business strategy into the calculation?
  • Is the product a contractual or non-contractual product?

If you can answer these questions, you can start to implement your first actionable version of LTV.

See a demo here.

 

 

About the author: Janne Flinck is an AI & Data Lead at Codento. Janne joined Codento from Accenture in 2022 with extensive experience in Google Cloud Platform, data science, and data engineering. His interests are in creating and architecting data-intensive applications and tooling. Janne has three professional certifications and one associate certification in Google Cloud and a Master’s degree in Economics.

 

Please contact us for more information on how to utilize machine learning to optimize your customers’ LTV.

Cloud Digital Leader Certification – Why’s and How’s?

#GOOGLECLOUDJOURNEY: Cloud Digital Leader Certification – Why’s and How’s?

Author: Anthony Gyursanszky, CEO, Codento

 

Foreword

As our technical consultants here at Codento have been busy completing their professional Google certifications, my colleagues in business roles and I have tried to keep up with the pace by obtaining Google’s sales credentials (which were required for company-level partner status) and studying the basics with Coursera’s Google Cloud Fundamentals courses. While the technical labs in the latter courses were interesting and concrete, they were not really needed in our roles, and were a small source of frustration.

Then the question arose: what is the proper way to obtain adequate knowledge of cloud technology and digital transformation from a business perspective, and to keep up with the latest Google Cloud products and roadmap?

I recently learned that many of my colleagues in other ecosystem companies have earned Google’s Cloud Digital Leader certification. My curiosity arose: would this be one for me as well?

 

Why bother in the first place?

In Google’s words “a Cloud Digital Leader is an entry level certification exam and a certified leader can articulate the capabilities of Google Cloud core products and services and how they benefit organizations. The Cloud Digital Leader can also describe common business use cases and how cloud solutions support an enterprise.”

I had earlier assumed that this certification covers both Google Cloud and Google Workspace, and especially how cultural transformation is led in the Workspace area, but this assumption turned out to be completely wrong. There is nothing at all covering Workspace here; it is all about Google Cloud. This was good news to me, as even though we are satisfied Workspace users internally, our consultancy business is solely with Google Cloud.

So what does the certificate cover? I would describe the content as follows:

  • Fundamentals of cloud technology, its impact, and the opportunities it brings to organizations
  • Different data challenges and opportunities, and how the cloud and Google Cloud can help, including ML and AI
  • Various paths by which organizations can move to the cloud, and how Google Cloud can be utilized in modernizing their applications
  • How to design, run, and optimize the cloud, mainly from a business and compliance perspective

If these topics are relevant to you and you want to take the certification challenge, Cloud Digital Leader is for you.

 

How to prepare for the exam?

As I moved on with my goal of obtaining the actual certification, I learned that Google offers free training modules for partners. The full partner technical training catalog is available for partners on Google Cloud Skills Boost for Partners. If you are not a Google Cloud partner, the same training is also available free of charge here.

The training modules are of high quality, super clear, and easy to follow. There is a student slide deck for each of the four modules, with about 70 slides in each. The amount of text and information per slide is limited, and it does not take many minutes to go through them.

The actual videos can be run through in double-speed mode, and a passing rate of 80% is required in the quizzes after each section. Compared to the actual certification test, the quizzes turn out to be slightly more difficult, as questions with multiple correct answers are also presented.

In my experience, it takes about 4-6 hours to go through the training and ensure a good chance of obtaining the actual certification. This is far from the effort required to pass a professional technical certification, where we are talking about weeks of work and plenty of prerequisite knowledge.

 

How to register for the test?

The easiest way is to book an online proctored test through Webassessor. The cost is 99 USD plus VAT, which you need to pay in advance. There are plenty of available time slots for remote tests, at 15-minute intervals, on basically any weekday. And yes, if you are wondering, the time slots are presented in your local time, even though this is not mentioned anywhere.

How to complete the online test? There are a few prerequisites before the test:

  • A room where you can work in privacy
  • Your table needs to be clean
  • Your ID needs to be available
  • You need to install the secure browser and upload your photo in advance (at least 24 hours, as I learned)
  • Follow the other instructions given in the registration process

The exam link will appear on the Webassessor site a few minutes before the scheduled slot. You will first wait 5-15 minutes in a lobby and then be guided through a few steps, like showing your ID and showing your room and table with your web camera. This part takes some 5-10 minutes.

Once you start the test, a timer is shown throughout the exam. While the maximum time is 90 minutes, it will likely take only some 30 minutes to answer all 50-60 questions. The questions are pretty short and simple. Four alternatives are proposed and only one is correct. If you hesitate between two possible answers (as happened to me a few times), you can come back to them at the end. Some sources on the web indicate that 70% of the questions need to be answered correctly.

Once you submit your answers, you will immediately be notified whether you passed. No information about grades or right/wrong answers is provided, though. Google will come back to you with the actual certification letter in a few business days. A possible retake can be scheduled no earlier than 14 days later.

 

Was it worthwhile – my two cents

The Cloud Digital Leader certification is not counted as a professional certification and is not included in any of the company-level partner statuses or specializations. This might, however, change in the future.

I would assume that Google has the following objectives for this certification:

  • To provide a role-independent entry certification, also for general management, as in other ecosystems (Azure and AWS Fundamentals)
  • To bring the Google Cloud ecosystem closer together with a proper common language and vision, including partners, developers, Google employees, and customer decision makers
  • To align business and technical people to work better together, speak the same language, and understand high-level concepts in the same way
  • To provide basic sales training to a wider audience, so that salespeople can feel “certified” like technical people

The certification is valid for three years, but while the basic principles will still apply in the future, the Google Cloud product knowledge will become obsolete pretty quickly.

Was it worth it? For me, definitely yes. I practically went through the material in one afternoon and booked a certification test for the next morning, so not too much time was spent in vain. As I am already sort of a cloud veteran and Google Cloud advocate, I would assume this would be an even more valuable eye-opener for AWS/Azure fans who have not yet understood the broad potential of Google Cloud. Thumbs up also for all of us business people in the Google ecosystem – this is a must-do entry point for working in our ecosystem.

 

 

About the author:

Anthony Gyursanszky, CEO, joined Codento in late 2019 with more than 30 years of experience in the IT and software industry. Anthony has previously held management positions at F-Secure, SSH, Knowit / Endero, Microsoft Finland, Tellabs, Innofactor and Elisa. Gyursanszky has also served on the boards of software companies, including Arc Technology and Creanord. Anthony also works as a senior consultant for Value Mapping Services. Anthony’s experience covers business management, product management, product development, software business, SaaS business, process management and software development outsourcing. And now Anthony is also a certified Cloud Digital Leader.

 

 

Contact us for more information about Codento services:

Codento Community Blog: Six Pitfalls of Digitalization – and How to Avoid Them


By Codento consultants

 

Introduction

We at Codento have been working hard over the last few months as consultants on various digitalization projects and have faced dozens of different customer situations. At the same time, we have noticed how often these projects run into the same pitfalls that could have been avoided in advance.

The life mission of a consulting firm like Codento is to provide a two-pronged vision for its clients: to replicate commonly observed successes and, on the other hand, to avoid common pitfalls.

Drifting into avoidable, repetitive pitfalls always causes a lot of disappointment and frustration, so the entire Codento team of consultants stopped to reflect and put together our ideas, especially on how to avoid these pitfalls.

A lively and multifaceted communal exchange of ideas was born, which, based on our own experience and vision, we condensed into six root causes:

  1. Starting by solving the wrong problem
  2. Remaining bound to existing applications and infrastructure
  3. Being stuck with the current operating models and processes
  4. The potential of new cloud technologies is not being optimally exploited
  5. Data is not sufficiently utilized in business
  6. The utilization of machine learning and artificial intelligence does not lead to a competitive advantage

Next, we will go through this interesting dialogue with Codento consultants.

 

Pitfall 1: Starting by solving the wrong problem

How many Design Sprints and MVPs in the world have been implemented to create new solutions in such a way that the original problem setting and customer needs were based on false assumptions or otherwise incomplete?

Or how many problems more valuable to the business have remained unresolved because they were left in the backlog? Choosing the technology, for example between an off-the-shelf product and custom software, is often the easiest step.

There is nothing wrong with the Design Sprint or Minimum Viable Product methodologies per se: they are very well suited to uncertainty, an experimental approach, and avoiding unnecessary production work, but there is certainly room for improvement in which problems they are applied to.

Veera also recalls one situation: “We start solving the problem in an MVP-minded way without thinking very far about how the app should work in different use cases. The application can become a collection of special cases with no connecting factor between them. Later, major renovations may be required when the original architecture or data model does not stretch far enough.”

Markku smoothly lists the typical problems associated with the conceptualization and MVP phase: “A certain rigidity in rapid and continuous experimentation, a tendency to perfection, a misunderstanding of the end customer, the wrong technology or operating model.”

“My own solution is always to reduce the problem definition to such a small sub-problem that it is faster to solve and more effective to learn from. At the same time, the positive mood grows when something visible is always achieved,” adds Anthony.

Toni sees three essential steps as a solution: “A lot of different problem candidates are needed. One of them is selected for refinement on the basis of common criteria. The problem definition is then worked on both broadly and deeply. Only then should you go to a Design Sprint.”

 

Pitfall 2: Trapped with existing applications and infrastructure

It’s easy in “greenfield” projects where the table is clean, but what do you do when an application and IT environment that has gathered dust over the years stands in the way of an ambitious digital vision?

Olli-Pekka starts: “Software is not ready until it is taken out of production. Until then, more or less money will sink into it, which would be nice to get back, either as working time saved or simply as income. If the systems in production are not kept on track, the costs sunk into them are guaranteed to surpass the benefits sooner or later. This is due to inflation and the exponential development of technology.”

“A really old system that supports a company’s business and is virtually impossible to replace,” continues Jari T. “Its low turnover and the age of its technology mean that the system is not worth replacing. It will be shut down as soon as the last parts of the business it supports have been phased out.”

“A monolithic system comes to mind that cannot be renewed part by part. Renewing the entire system would cost too much,” adds Veera.

Olli-Pekka outlines three different situations: “Depending on the user base, the pressures for modernization are different, but the need for it will not disappear at any stage. Let’s take a few examples.

Consumer products – There is no market for antiques in this industry unless your business is based on the sale of NFTs from Doom’s original source code, and even then. Or when was the last time you admired Win-XP CDs on a store shelf?

Business products – a slightly more complicated case. The point here is that for the system you use to stay relevant to your business, it needs to play nicely with the other systems your organization uses. Otherwise, a replacement will be sought for it, because manual steps in the process are both expensive and error-prone. Of course, there is no problem if no other vendor ever updates their products – but I would not count on that.

Internal use – no need to modernize? Here you have to train new users yourself, because no one else is teaching your stack anymore. Also remember to hope that none of the people you have managed to entice into this technological impasse take a peek over the fence. And set aside a little extra funding for maintenance contracts, as outside vendors may raise their prices when the number of users of their sunset products drops.”

A few concepts immediately came to Iiro’s mind: “Path dependency and the sunk cost fallacy. Could one write a whole blog about each of them?”

“What reasons and obstacles come up in different cases?” ask Sami and Marika.

“At least budgetary challenges, the complexity of environments, the lack of integration capacity, data security, and legislation come to mind. So what would be the solution?” Anthony answers at the same time.

Olli-Pekka’s three ideas emerge quickly: “Map your system – use external pairs of eyes for this as well, because they can identify even the details your own eye has grown used to. An external expert can also ask the right questions and fish for the answers. Plan your route out of the trap – you should rarely rush blindly in every direction at the same time. It is enough to pierce an opening where the fence is weakest. From there you can start expanding and building new pastures at a pace that suits you. Invest in know-how – the easiest way to make a hole in a fence is with the right tools, and a skilled worker will make the opening so that it remains easy to pass through without tearing your clothes. Do not count on finding this skill in-house, because if it were there, the opening would already exist. In any case, help is needed.”

 

Pitfall 3: Remaining captive to current operating models

“Which is the bigger obstacle in the end: infrastructure and applications or our own operating models and lack of capacity for change?”, Tommi ponders.

“I would lean towards operating models myself,” says Samuel. “I am strongly reminded of the silo between business and IT, the high level of risk aversion, the lack of resilience, and the vagueness or outright lack of a guiding digital vision.”

Veera adds: “Old processes are modeled as-is for a new application, instead of thinking about how to change the processes and benefit from better ones at the same time.”

Elmo immediately lists a few practical examples: “Word + SharePoint documentation is limiting because ‘this is how it has always been done’. Resistance to change means that modern practices and the latest tools cannot be used, which prevents some contributions from being made at all. This limits the user base, as it is not possible to use the organisation’s expertise across team boundaries.”

Anne continues: “Excel + Word documentation models result in information that is scattered and difficult to maintain. Information flows by e-mail. The biggest obstacle is culture and our ways of working, not the technology itself.”

“What should I do and where can I get motivation?” Perttu ponders, and continues with a proposed solution: “Small wins quickly – the low-hanging fruit should be picked. The longer inefficient operations last, the more expensive it is to get out of them. The sunk cost fallacy is loosely related to this.”

“There are limitless areas to improve.” Markku opens up the range of options: “Business collaboration, product management, application development, DevOps, testing, integration, outsourcing, further development, management, resourcing, subcontracting, tools, processes, documentation, metrics. There is no need to be world-class in everything, but it is good to improve the area or areas that have the greatest impact for an optimal investment.”

 

Pitfall 4: The potential of new cloud technologies is not being exploited

Google Cloud, Azure, AWS or multi-cloud? Is this the most important question?

Markku answers: “I don’t think so. The indicators of financial control move cloud costs away from depreciation and directly higher up the lines of the income statement, and the target setting of many companies does not bend to this, although in reality it would have a much more positive effect on cash flow in the long run.”

A few situations come to Sanna’s mind: “A technology is chosen that is believed to best suit the needs, because there is not enough comprehensive knowledge and experience of the existing technologies and their potential. One may therefore end up in a situation where a lot of logic and features have already been built on top of the chosen technology when it turns out that another model would have suited the use case better. A real-life experience: ‘With these functions, this can be done quickly’ – and two years later: ‘Why wasn’t the IoT hub chosen?’”

Perttu emphasizes: “The use of digital platforms at work (e.g. Drive, Meet, Teams) is closer to everyday business than the cold, technical core of cloud technology. Especially as the public debate has recently revolved around a few big companies instructing their employees to return to on-site work.”

Perttu continues: “Compared to this, the services offered by digital platforms make operations more agile, enable a wider range of lifestyles, and streamline business operations. It must be remembered, of course, that physical encounters are also important to people, but one could assume that experts in any field are best placed to define effective ways of working for themselves. Win-win, right?”

So what’s the solution?

“I think the most important thing is that the cloud capabilities to be deployed are adapted to the selected short- and long-term use cases,” concludes Markku.

 

Pitfall 5: Data is not sufficiently utilized in business

Hardly any company can avoid the question of whether the bulk of its data is well managed and of good integrity. But what are the different challenges involved?

Aleksi explains: “The practical obstacle to the wider use of data in an organization is quite often the poor visibility of the available data. There may be many hidden data sets whose existence is known to only a couple of people. These may only be found by chance by talking to the right people.

Another similar problem is that for some data sets, the content, structure, origin, or manner of creation of the data is no longer really known – and there is little documentation of it.”

Aleksi continues, “An overly absolute and early-applied business case approach prevents data from being exploited in experiments and development that involve a ‘research aspect’. This is the case, for example, in many new machine learning initiatives: it is not clear in advance what can be expected, or even whether anything usable can be achieved. Thus, such early work is difficult to justify with a normal business case.

It can be better to assess the potential benefits the approach could have if successful. If these benefits are large enough, you can start experimenting, review the situation constantly, and quickly drop the ideas that turn out to be bad. The time for the business case may come later.”

 

Pitfall 6: The use of machine learning and artificial intelligence will not lead to a competitive advantage

It seems fashionable these days for business managers to attend various machine learning courses, and a varying number of experiments are underway in organizations. Yet things have not progressed very far, have they?

Aleksi shares his experiences: “Over time, the current ‘traditional’ approach has been refined quite well, and there is very little potential left for improvement. The first machine learning experiments do not produce a better result than the current approach, so it is decided to stop examining and developing them. In many cases, however, the potential of the current operating model has been almost completely exhausted over time, while on the machine learning side the ceiling for improvement is much higher. It is as if we are locked into the current way only because the first attempts did not immediately bring improvement.”

Anthony summarizes the challenges into three components: “Business value is unclear, data is not available and there is not enough expertise to utilize machine learning.”

Jari R. refers to his own talk at the spring’s business-oriented online machine learning event. “If I remember correctly, I compiled a list of as many as ten pitfalls that fit this topic. They are easy to read in the event material:

  1. The specific business problem is not properly defined.
  2. No target is defined for model reliability or the target is unrealistic.
  3. The choice of data sources is left to data scientists and engineers and the expertise of the business area’s experts is not utilized.
  4. The ML project is carried out exclusively by the IT department itself. Experts from the business area will not be involved in the project.
  5. The data needed to build and utilize the model remains fragmented across different systems, and cloud platform data solutions are not utilized.
  6. The retraining of the model in the cloud platform is not taken into account already in the development phase.
  7. The most fashionable algorithms are chosen for the model. The appropriateness of the algorithms is not considered.
  8. The root causes of the errors made by the model are not analyzed; instead, statistical accuracy metrics are blindly relied on.
  9. The model is built to run on the data scientist’s own machine, and its portability to the cloud platform is not considered during the development phase.
  10. The ability of the model to analyze real business data is not systematically monitored and the model is not retrained.”

This serves as a good example of the thoroughness of our data scientists. It is easy to agree with the list, and to believe that we at Codento have a clear view of how to avoid the pitfalls in this area as well.

 

Summary – Avoid pitfalls in a timely manner

To help you steer clear of these pitfalls, Codento’s consultants offer free two-hour workshops to interested organizations, each focusing on one of the pitfalls at a time:

  1. Digital Value Workshop: Clarified and understandable business problem to be solved in the concept phase
  2. Application Renewal Workshop: A prioritized roadmap for modernizing applications
  3. Process Workshop: Identifying potential policy challenges for the evaluation phase
  4. Cloud Architecture Workshop: Helps identify concrete steps toward high-quality cloud architecture and its further development
  5. Data Architecture Workshop: Preliminary current situation of data architecture and potential developments for further design
  6. Artificial Intelligence Workshop: Prioritized use case descriptions for more detailed planning from a business feasibility perspective

Ask us for more information and we will book an appointment for August, so that your autumn starts comfortably, pitfalls avoided.

 

Piloting Machine Learning at Speed – Utilizing Google Cloud and AutoML

 

Can modern machine learning tools do a week’s work in an afternoon? The development of machine learning models has traditionally been a very iterative process. A traditional machine learning project starts with the selection of datasets and their cleaning and pre-processing. Only then can the actual development of the machine learning model begin.

It is very rare, virtually impossible, for a new machine learning model to make sufficiently good predictions on the first try. Development has therefore traditionally involved a significant number of failures, both in the selection of algorithms and in their fine-tuning (in technical terms, hyperparameter tuning).

All of this takes working time, in other words money. What if, once the data is cleaned, every remaining step of development could be automated? What if a development project could be completed in a single day-long sprint?

 

Machine learning and automation

In recent years, the automation of building machine learning models (AutoML) has taken significant leaps. Roughly put, in traditional machine learning a data scientist builds a machine learning model and trains it with a large dataset. AutoML, on the other hand, is a relatively new approach in which the machine learning model, in effect, builds and trains itself using the dataset.

All the data scientist needs to do is tell the tool what the problem is: machine vision, pricing, or text analysis, for example. AutoML will not make data scientists unemployed, however. The workload simply shifts from fine-tuning the model to validating it and working with Explainable AI tools.

 

Google Cloud and AutoML used to solve a practical challenge

Some time ago, we at Codento tested Google Cloud’s AutoML-based machine learning tools [1]. Our goal was to find out how well the Google Cloud AutoML tooling solves the Kaggle House Prices – Advanced Regression Techniques challenge [2].

The goal of the challenge is to build the most accurate model possible for predicting the selling prices of properties based on their attributes. The dataset used to build the pricing model contained data on approximately 1,400 properties: in total 80 different parameters that could potentially affect the price, as well as the actual sales prices. Some of the parameters were numerical, some categorical.

 

Building a model in practice

The data used was pre-cleaned, so the first phase of building a machine learning model was already done. First, the dataset, a CSV file, was uploaded as-is to the Google Cloud BigQuery data warehouse. The upload took advantage of BigQuery’s ability to infer the table schema directly from the file structure. The actual model was built with the AutoML Tabular feature of the Vertex AI tool.

After some clicking, the tool was told which of the price-predicting parameters were numeric and which were categorical variables, and which column contained the target to be predicted. All of this took about an hour of work. Training was then started, and we waited for the results. About 2.5 hours later, Google Cloud sent an email stating that the model was ready.
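The point-and-click flow above can also be expressed in code with the Vertex AI Python SDK. The sketch below is illustrative rather than the exact configuration used in the pilot: the display names and the BigQuery table reference are assumptions, and running it requires a Google Cloud project with the `google-cloud-aiplatform` package installed.

```python
def train_house_price_model(project: str, location: str, bq_table: str):
    """Sketch of the AutoML Tabular flow described above.

    bq_table is a BigQuery reference such as "my-project.housing.prices"
    (an illustrative name, not the table used in the pilot).
    """
    # Imported inside the function so the sketch can be read and imported
    # even without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location)

    # Create a managed tabular dataset directly from the BigQuery table;
    # Vertex AI infers column types from the schema.
    dataset = aiplatform.TabularDataset.create(
        display_name="house-prices",
        bq_source=f"bq://{bq_table}",
    )

    # The pricing task is a regression problem (predict a numeric price).
    job = aiplatform.AutoMLTabularTrainingJob(
        display_name="house-prices-automl",
        optimization_prediction_type="regression",
    )

    # Training runs on Google Cloud; this call blocks until the model is
    # ready (about 2.5 hours in our experiment). "SalePrice" is the target
    # column name in the Kaggle dataset.
    return job.run(dataset=dataset, target_column="SalePrice")
```

In practice the same steps can equally well be done in the Cloud Console UI, as we did; the SDK version is useful when the training needs to be repeatable.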

 

The final result was a positive surprise

The accuracy of the model created by AutoML surprised its developers. Google Cloud AutoML independently built a pricing model that predicts house prices with approximately 90% accuracy. The level of accuracy as such does not differ from that of pricing models in general; what is noteworthy is that developing this model took half a working day in total.
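The article does not state the exact metric behind the ~90% figure. One common way to express a pricing model’s accuracy as a percentage is one minus the mean absolute percentage error (MAPE); a minimal sketch with made-up prices:

```python
def accuracy_from_mape(actual, predicted):
    """Percentage-style accuracy: 1 - mean absolute percentage error."""
    errors = [abs(a - p) / a for a, p in zip(actual, predicted)]
    return 1.0 - sum(errors) / len(errors)

# Illustrative sale prices (EUR) versus model predictions:
actual = [200_000, 150_000, 310_000, 95_000]
predicted = [210_000, 138_000, 300_000, 99_000]
print(f"accuracy = {accuracy_from_mape(actual, predicted):.0%}")
# prints: accuracy = 95%
```

Note that the Kaggle challenge itself scores submissions differently (on log-price error), so this is only an intuition for reading percentage figures like the one above.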

The benefits of Google Cloud AutoML do not end there, however. The model could be integrated into a Google Cloud data pipeline with very little effort, or exported as a container and deployed on other cloud platforms.

 

Approach which pays off in the future as well

For good reason, AutoML-based tools can be considered the latest major step in machine learning. Thanks to them, the development of an individual machine learning model no longer has to be treated as a project or an investment. Used to their full potential, models can be built on a near-zero budget, and new machine-learning-based forecasting models can be created almost on a whim.

However, the effective deployment of AutoML tools requires a significant initial investment: the entire data infrastructure, from data warehouses and lakes to data pipelines and visualization layers, must first be built with cloud-native tools. Codento’s certified cloud architects and data engineers can help with these challenges.

 

Sources:

1. Google Cloud AutoML, https://cloud.google.com/automl/

2. Kaggle, House Prices – Advanced Regression Techniques, https://www.kaggle.com/competitions/house-prices-advanced-regression-techniques/

 

The author of the article is Jari Rinta-aho, Senior Data Scientist & Consultant, Codento. Jari is a consultant and physicist interested in machine learning and mathematics, with extensive experience in utilizing machine learning in nuclear energy. He has also taught physics at several universities and led international research projects. Jari’s interests include ML-Ops, AutoML, Explainable AI and Industry 4.0.

 

Ask more about Codento’s AI and data services:

Certificates Create Purpose

#GCPJOURNEY

Author: Jari Timonen, Codento Oy

What are IT certifications?

Personal certifications give IT service companies a way to describe the level and scope of their consultants’ expertise. For an IT service provider, certifications, at least in theory, guarantee that a person knows their stuff.

The certification exam is taken under controlled conditions and usually consists of multiple-choice questions. There are also task-based exams, in which the required assignment is completed freely at home or at work.

There are many levels of certification for different target groups. They are usually hierarchical, so you can approach a completely unfamiliar topic starting from the easiest level. At the top are the most difficult and most respected certificates.

At Codento, personal certifications are an integral part of self-development and one measure of competence. We support earning certificates by allowing working time to be spent studying and by paying for the courses and the exam itself. Google’s selection includes a certification of a suitable level and subject for everyone.

An up-to-date list of certifications can be found on the Google Cloud website.

Purposefulness at the center

Completing certificates merely for the sake of the diploma is not a very sensible approach. Rather, a certification should be seen as a goal that gives structure to your studying, so that there is a red thread to follow in your self-development.

The goal may be to complete only one certificate or, for example, a planned path through three different levels. This way, self-development is much easier than reading an article here and there without a goal.

Schedule as a basis for commitment

After setting the goal, choose a schedule for the exam. This varies a lot depending on your starting level and the certification in question. If you already have prior knowledge, studying may be a mere recap. Generally speaking, set aside a few months for studying: over a longer period, what you learn sticks better and is therefore more useful.

Take practice exams from time to time. They help you determine which parts of the exam need more study and which areas you already command. Take practice exams early on, even if the results are poor; this way you gain experience for the real thing, and the questions in the actual exam will not come as a complete surprise.

Book the exam approximately 3-4 weeks before your planned completion date. That leaves time to take enough practice exams and consolidate your skills.

Reading both at work and in your free time

It is a good idea to start studying by understanding the exam scope: find out the different areas of emphasis and list the topics. Then make a rough study plan, scheduled by topic area.

After planning, you can start studying one topic at a time. Approach topics from the top down: first try to understand the whole, then go into the details. For cloud service certifications, one of the most important learning tools is doing. Try things out yourself instead of only reading books; the memory trace is much stronger when you experiment with how the services actually work.

Study both at work and in your free time. It is usually a good idea to reserve time in your work calendar for studying, and to schedule leisure time for it as well if possible. That way the studying is far more likely to actually get done.

Studying regularly is worth it

Over the years, I have completed several certifications in various subject areas: Sun Microsystems, Oracle, AWS, and GCP. In all of them, your own passion and desire to learn are decisive. Each certification provides a basis for the next, so studying becomes easier over time. For example, if you have completed the AWS architect certifications, you can build on them when working toward the corresponding Google Cloud certifications. The technologies differ, but the architectural principles differ little, because cloud-native architecture is not tied to any one cloud.

The most important thing I’ve learned: Study regularly and one thing at a time.

Concluding remarks: Certificates and hands-on experience together guarantee success

Certificates are useful tools for self-development. They do not in themselves guarantee full competence, but they provide a good basis for growing into a professional. Certification combined with hands-on everyday work is one of the strongest ways to learn modern cloud services, and it benefits everyone – employee, employer and customer – regardless of skill level.

The author of the blog, Jari Timonen, is an experienced software professional with more than 20 years in the IT field. Jari’s passion is building bridges between business and technical teams, which he did in his previous position at Cargotec, for example. At Codento, he is in his element piloting customers toward future-proof cloud and hybrid cloud environments.

Business-driven Machine Learning with Google Cloud: Multilingual Customer Feedback Classifier

Author: Jari Rinta-aho, Codento

At Codento, we have rapidly expanded our offering into demanding data and machine learning implementations and services. In discussions with our customers, the following business goals and expectations come up again and again:

  • Uncovering hidden regularities in data
  • Automation of analysis
  • Minimizing human error
  • New business models and opportunities
  • Improving and safeguarding competitiveness
  • Processing of multidimensional and versatile data material

In this blog post, I will go through the lessons from a recent customer case.

Competitive advantage from a deep understanding of customer feedback

A very concrete business need arose this spring for a Finnish B-to-C company: huge amounts of customer feedback data come in, but how can the feedback be used intelligently to make the right business decisions?

Codento recommended the use of machine learning

Codento’s recommendation was to tackle the challenge with machine learning and Google Cloud’s off-the-shelf features, so that a customer feedback classifier would be ready within a week.

The goal was to automatically classify short customer feedback into three baskets: positive, neutral, and negative. The feedback consisted mainly of short Finnish texts, with a few written in Swedish and English. The classifier therefore also had to recognize the language of the source text automatically.

Can you really expect results in a week?

The project schedule was tight and the goal ambitious. There was no time to waste; in practice, results had to come on the first try. Codento therefore decided to make the most of ready-made cognitive services.

Google Cloud plays a key role

The classifier was implemented by combining two ready-made tools from Google Cloud: the Translate API and the Natural Language API. The idea was to machine-translate the texts into English and then determine their sentiment. Because the Translate API can automatically detect the source language among roughly a hundred languages, the tooling met the requirements, at least on paper.
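The translate-then-score pipeline can be sketched as follows. The two stub functions stand in for the Translate API and the Natural Language API (the real clients live in the `google-cloud-translate` and `google-cloud-language` packages and require credentials), and the score thresholds for the three baskets are illustrative assumptions, not the values used in the project.

```python
def detect_and_translate(text: str) -> str:
    """Stand-in for the Translate API: detect the source language and
    return an English translation. Here the text is passed through."""
    return text

def sentiment_score(text_en: str) -> float:
    """Stand-in for the Natural Language API sentiment score in [-1, 1].
    A tiny keyword heuristic so the sketch runs without cloud credentials."""
    positive = {"great", "excellent", "good", "love"}
    negative = {"bad", "terrible", "slow", "broken"}
    words = set(text_en.lower().split())
    if words & positive:
        return 0.8
    if words & negative:
        return -0.8
    return 0.0

def classify_feedback(text: str) -> str:
    """Translate, score, and sort into the three baskets used in the case."""
    score = sentiment_score(detect_and_translate(text))
    if score > 0.25:   # illustrative threshold
        return "positive"
    if score < -0.25:  # illustrative threshold
        return "negative"
    return "neutral"

print(classify_feedback("The service was excellent"))  # positive
print(classify_feedback("Delivery was terrible"))      # negative
```

Swapping the stubs for the real API clients turns this into the actual classifier; the surrounding logic stays the same.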

Were the results useful?

The results were validated with random sampling and manual work. From the existing data, 150 texts were selected at random. These texts were first sorted by hand into the three categories: positive, neutral, and negative. The same classification was then made with the tool we developed, and finally the tool’s results were compared with the manual ones.

What was achieved?

The tool and the human analyst agreed on about 80% of the feedback, and in no case did they take opposite views. The validation results were collected into a confusion matrix.

The numbers 18, 30, and 75 on the diagonal of the confusion matrix are the feedback items for which the validator and the tool agreed on the tone. In 11 cases, the validator considered the tone positive while the tool judged it neutral.
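The "about 80%" agreement figure follows directly from the diagonal counts above (the order of the classes along the diagonal is assumed here; the off-diagonal cells other than the 11 are not given in the text):

```python
# Diagonal of the validation confusion matrix: cases where the
# validator and the tool agreed on the tone of the feedback.
diagonal = [18, 30, 75]   # assumed order: negative, neutral, positive
total = 150               # texts sampled for validation

agreement = sum(diagonal) / total
print(f"agreement = {agreement:.0%}")  # prints: agreement = 82%
```

The remaining 27 texts are the disagreements, 11 of which were the positive-versus-neutral cases discussed below.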

 

The most significant factor explaining the tool’s differing interpretations is the cultural context of the wording of the feedback: when a Finn says “no complaints”, he is giving praise.

From an American, the same phrase would count as neutral feedback. This cultural difference alone is enough to explain why the largest single error group was “positive according to the validator, neutral according to the tool”. The remaining errors come down to borderline cases: it is impossible to say unambiguously when slightly positive feedback turns neutral, and vice versa.

Utilizing the solution in business

The validated approach was well suited to the challenge and is an excellent starting point for understanding the nature of the feedback, building further models for more detailed analysis, speeding up analysis, and reducing manual work. The solution can also be applied to a wide range of similar situations and needs in other processes and industries.

The author of the article is Jari Rinta-aho, Senior Data Scientist & Consultant, Codento. Jari is a consultant and physicist interested in machine learning and mathematics, with extensive experience in applying machine learning, for example in nuclear technology. He has also taught physics at university and led international research projects.