Tools such as “Big Data,” “Artificial Intelligence,” and “Machine Learning” resonate as fundamental parts of the transformation businesses are undergoing with technology. However, these are not an end in themselves; rather, they are tools to achieve a goal. Their value is not defined by simply having them, but by the ability to use them to solve business problems, thereby allowing organizations to capture greater value. 

At the same time, using these tools effectively requires more than feeding data into a model to generate a solution. It requires a structured process based on the scientific method: identifying a business problem or opportunity, defining improvement hypotheses, and considering how the available data can help validate or reject those hypotheses. We have therefore found it crucial for organizational leaders to understand this process, so that they can integrate these tools into their companies and turn them into a real aid for increasing productivity.

Below, we present a recent case that illustrates this process and its main challenges.

Applied Case: Data Analytics in a Healthcare Organization 

In 2020, due to the social unrest and the pandemic, a healthcare organization faced high uncertainty regarding its funding channels. Recently, it had developed efficiency measures in its back office, which, while helping it to be better positioned to face this situation, were not sufficient in case of a significant drop in resources. The organization’s top executives then asked: Are there opportunities for efficiency in the treatments we provide? Is it possible to capture them without losing effectiveness? 

Answering these questions posed an additional problem: because the organization is geographically dispersed, with over 10 health centers across the country, and sanitary restrictions limited inter-regional travel, process mapping to identify improvement opportunities could not be conducted on site. Therefore, our proposal was to tackle the problem through data. Historically, treatments were defined by medical staff based on their experience and personal judgment. Our hypothesis was that within the large dataset of treatments, we could identify some that were more efficient and effective than others, and that offering these as standard treatments would improve the institution’s performance.

To address this problem, we defined a five-step process, following the scientific method:

First Step: Identify How to Measure the Success of a Medical Treatment 

We started by identifying how to measure the success of a medical treatment: to determine whether a treatment had been effective, we first needed an indicator that would tell us so. In other words, we began by defining how to answer the key business question: Did the medical treatment achieve its objective?

Second Step: Explore the Data and Identify Trends 

We then analyzed the available information about the more than 18,000 treatments carried out by the institution. This allowed us to establish that a medical treatment is composed of a set of therapeutic activities such as surgery, physiotherapy sessions, and occupational therapy sessions. We also identified that, although there were over 350 distinct medical therapeutic activities, not all had a direct impact on patients; for example, a radiographic exam does not directly improve the patient as it is a diagnostic activity. Additionally, we identified the direct cost of each activity, assigning costs for materials, labor, and use of facilities to each one. 
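As an illustration of the cost assignment described above, the following is a minimal Python sketch, not the model used in the project. The class name, field names, and all figures are hypothetical. It adds up materials, labor, and facilities into a direct cost per activity and separates therapeutic activities from diagnostic ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    """One activity within a treatment (all values are made up)."""
    name: str
    therapeutic: bool   # False for diagnostic activities, e.g. a radiographic exam
    materials: float
    labor: float
    facilities: float

    @property
    def direct_cost(self) -> float:
        # Direct cost = materials + labor + use of facilities
        return self.materials + self.labor + self.facilities


def treatment_summary(activities):
    """Return the total direct cost and the names of the therapeutic activities."""
    total_cost = sum(a.direct_cost for a in activities)
    therapeutic = [a.name for a in activities if a.therapeutic]
    return total_cost, therapeutic


treatment = [
    Activity("surgery", True, 400.0, 900.0, 300.0),
    Activity("physiotherapy session", True, 5.0, 40.0, 15.0),
    Activity("radiographic exam", False, 20.0, 30.0, 25.0),
]
cost, therapeutic = treatment_summary(treatment)
print(cost, therapeutic)
```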

From this data exploration, we also identified other key factors to consider when constructing a model and analyzing results. One of them was that the final outcomes of medical treatments are influenced by variables other than the therapeutic activities they comprise. For example, the progress of patients with the same medical diagnosis and severity is influenced by variables such as the healthcare center where the patient is treated and the socioeconomic group to which the patient belongs. Isolating these effects in the final analysis was essential to avoid drawing incorrect conclusions.
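To see why isolating these effects matters, consider the following toy Python sketch. The records are invented for illustration and the helper name is hypothetical. It compares two treatments pooled across centers and then within each center. The same stratification could be extended to socioeconomic group.

```python
from collections import defaultdict
from statistics import mean

# (center, treatment, success score) -- invented data for illustration only
records = [
    ("center_1", "T1", 0.8), ("center_1", "T1", 0.8), ("center_1", "T1", 0.8),
    ("center_1", "T2", 0.9),
    ("center_2", "T1", 0.4),
    ("center_2", "T2", 0.5), ("center_2", "T2", 0.5), ("center_2", "T2", 0.5),
]


def mean_score_by(records, key):
    """Average the success score of the records grouped by key(record)."""
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record[2])
    return {k: mean(v) for k, v in groups.items()}


pooled = mean_score_by(records, lambda r: r[1])              # ignores the center
stratified = mean_score_by(records, lambda r: (r[0], r[1]))  # compares within a center

print(pooled)
print(stratified)
```

In this toy data, T1 looks better when the centers are pooled, yet T2 scores higher inside each center, because T1 happens to be prescribed mostly where outcomes are better overall. A comparison that ignores the center would reach the wrong conclusion.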

Third Step: Select Tools and Build the Model 

Once we understood the business question to be solved and categorized the available data, we built a model. For this, we identified three sub-stages in the analysis, each requiring specific tools: 

  1. Identify clusters of similar patients to study the impact of medical treatments on a homogeneous population. For this stage, we selected “K-Means,” a technique that partitions a population into sub-groups whose members are similar to one another.
  2. Identify treatment options (sets of activities) for each patient cluster that maximize the chances of achieving objectives and are cost-efficient. For this, we chose a model based on “Support Vector Machine,” a supervised Machine Learning technique that allows learning from existing treatments and identifying treatments at the efficiency frontier between benefit and cost. Additionally, we applied linear regression, an econometric technique that helps estimate the strength of the relationship between variables.
  3. Choose the treatments that make the most medical sense for each diagnosis from among the effective and efficient options detected in the previous sub-stage. At this point, we decided to use a panel of experts as the tool, i.e., a group of medical experts selected according to diagnosis, to choose the recommended treatment among the identified options.
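The first sub-stage can be sketched as follows. This is a minimal, dependency-free Python version of the standard K-Means procedure (Lloyd's algorithm), not the project's implementation. The two patient features (severity and age, both scaled to the 0-1 range) and the data points are assumptions made for the example. In practice, an established library implementation would normally be used instead.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm.

    For reproducibility, the first k points are used as initial centroids
    (real implementations usually pick smarter, randomized starting points).
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical patients: (scaled severity, scaled age)
patients = [
    (0.10, 0.20), (0.90, 0.80), (0.15, 0.25),
    (0.85, 0.90), (0.20, 0.10), (0.80, 0.85),
]
labels, centroids = kmeans(patients, k=2)
print(labels)
```

Each resulting label identifies a group of similar patients within which treatments can be compared in the second sub-stage.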

Fourth Step: Adjust Data and Apply the Model 

Next, we structured the data to fit the format required by the tools defined in the previous step, so that they could be applied correctly. This step was crucial for reaching valid conclusions.
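One common form of this structuring is turning each treatment into a fixed-length numeric row with one column per activity, which is the input shape most clustering and classification tools expect. The Python sketch below assumes treatments are recorded as mappings from activity name to prescribed hours. The function name and the sample data are hypothetical.

```python
def to_matrix(treatments):
    """Convert treatments (activity -> hours) into rows with a shared column order."""
    # Use every activity seen in any treatment, sorted for a stable column order
    columns = sorted({name for t in treatments for name in t})
    # Activities absent from a treatment are filled with 0 hours
    rows = [[float(t.get(name, 0.0)) for name in columns] for t in treatments]
    return columns, rows


treatments = [
    {"physiotherapy": 12, "occupational therapy": 4},
    {"surgery": 1, "physiotherapy": 20},
]
columns, rows = to_matrix(treatments)
print(columns)
print(rows)
```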

With the data prepared, we applied the three-stage model, from which we obtained a base treatment for each patient cluster with the same diagnosis, severity, and age group.
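To make the idea of an efficiency frontier between benefit and cost concrete, here is a simple Python sketch. It is deliberately not the Support Vector Machine model used in the project. It is a plain dominance (Pareto) filter, swapped in only to show what it means for a treatment option to sit on the frontier. The option names, costs, and success rates are invented.

```python
def efficiency_frontier(options):
    """Return the names of options that no other option dominates.

    Option X dominates option Y when X costs no more, delivers at least
    the same benefit, and is strictly better on at least one of the two.
    """
    frontier = []
    for name, cost, benefit in options:
        dominated = any(
            c <= cost and b >= benefit and (c < cost or b > benefit)
            for _, c, b in options
        )
        if not dominated:
            frontier.append(name)
    return frontier


# (option, cost per patient, success rate) -- hypothetical values
options = [
    ("A", 100.0, 0.60),
    ("B", 150.0, 0.80),
    ("C", 160.0, 0.70),  # costs more than B and achieves less
    ("D", 220.0, 0.82),
    ("E", 120.0, 0.55),  # costs more than A and achieves less
]
frontier = efficiency_frontier(options)
print(frontier)
```

The options left on the frontier are the kind of shortlist that a panel of experts would then review in the third sub-stage.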

Fifth Step: Identify Impact and Potential for Application 

Finally, we identified that implementing the optimal treatments (focusing on activities with the highest aggregated benefit) would free up 12% of the resources annually allocated to therapeutic activities. This means that the healthcare center could serve an additional 3,000 clients in a year. For instance, for a patient cluster with mild diagnoses, applying the model would take the institution from prescribing 28,500 hours annually across 17 different therapeutic activities to 18,600 hours across just 2 therapeutic activities.
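The figures for the mild-diagnosis cluster can be checked with simple arithmetic. The short Python snippet below uses only the numbers quoted above. The variable names are ours.

```python
hours_before = 28_500  # annual hours across 17 therapeutic activities
hours_after = 18_600   # annual hours across 2 therapeutic activities

hours_freed = hours_before - hours_after
share_freed = hours_freed / hours_before

print(f"{hours_freed} hours freed ({share_freed:.1%} of the cluster's hours)")
```

For this particular cluster, the reduction is about 35% of prescribed hours, well above the 12% identified across all therapeutic activities.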

Given these results, together with the medical team and the main executives of the organization, we concluded that it made sense to use the base treatments identified for each cluster by recommending to each doctor a reference treatment for a patient based on their diagnosis. This approach guides doctors in defining the treatments they prescribe, thereby increasing the efficiency and effectiveness of these treatments.

Key Learnings 

This case demonstrates that there is value in applying technological data analysis tools. However, while they are highly useful, they are not “plug and play” tools. They must be accompanied and guided by a business vision that correctly defines the problems faced by the organization, and by a rigorous methodology based on the scientific method that guides the definition and subsequent validation or refutation of hypotheses.

Furthermore, their application requires management capability, particularly change management. This is because the analyses provide opportunities for improvement, but they do not add value if they do not inform decision-making and the implementation of new solutions within the organization. 

Therefore, executives in an organization must play a fundamental role in the process of incorporating technological tools as guides for solving business problems and enablers of change to implement the results obtained from their use.