raygbutler

Advanced Characterization of Experimental Groups using Automated Exploratory Data Analysis

Characterization of Experimental Groups using Exploratory Data Analysis

Characterizing experimental subject groups is a very common analysis task in research. Its goal is to identify significant differences between population groups in order to better understand the impact that a given treatment has on them.

Exploratory Data Analysis (EDA) techniques are a great ally when you need to work with a large number of factors and variables that have complex interrelationships among them.

Today I will explain how to use AutoDiscovery to perform this analysis in an agile and effective way.

Static and dynamic differences

Although there are many ways to categorize the possible differences between experimental groups, my experience at Butler Scientifics suggests that a good approach is to use the degree of interaction between those differences as the framework.

We may thus talk about:

Static differences, which arise from significantly different average values in a given subset of variables.
Dynamic differences, which arise from potential cause-effect relationships (hence the term “dynamic”) that may occur exclusively in certain experimental groups.

Identifying the static differences

As defined above, static differences are obtained by applying an analysis of variance (ANOVA or similar) between the grouping variables and all possible associated responses (qualitative factors).

This calculation is automatic and inherent to AutoDiscovery's discovery process, so to obtain the results we seek we only need to:

Configure the grouping variables used in the original experiment
Start a discovery process with an adequate depth level
Use HypoBooster to browse the relationships found between grouping variables and possible responses

The resulting list of relationships lets us extract all the responses that differ significantly between groups.
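For readers who want to see what a static difference looks like outside the tool, here is a minimal sketch of a one-way ANOVA F statistic in plain Python. It is not AutoDiscovery's implementation, which is not described here. The data set, the response names (`weight`, `heart_rate`) and the group names (`control`, `treated`) are hypothetical, and the function name `one_way_anova_f` is only illustrative. A larger F value points to a larger difference between group means relative to the spread within groups.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of samples (one list per group)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    group_means = [mean(g) for g in groups]
    grand_mean = mean(x for g in groups for x in g)

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of the observations around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical toy data: one grouping variable, two candidate responses
responses = {
    "weight": {"control": [70.1, 68.4, 71.3, 69.8], "treated": [64.2, 63.5, 65.9, 62.8]},
    "heart_rate": {"control": [72, 75, 71, 74], "treated": [73, 72, 74, 73]},
}

f_values = {
    name: one_way_anova_f(list(by_group.values()))
    for name, by_group in responses.items()
}

# Rank responses from the strongest to the weakest group separation
for name, f in sorted(f_values.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: F = {f:.2f}")
```

To decide whether a response is significantly different, the F value would still have to be compared against the F distribution for the corresponding degrees of freedom, a step this sketch leaves out.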

Identifying the dynamic differences

In order to complement the static differences report, we shall now look for the exclusive interactions suggested for each experimental group.

As a first step, we will take advantage of the relevance analysis that AutoDiscovery performs automatically.

The list of relationships identified in step 2 of the previous process can be grouped according to their relevance level.

Those classified as “Most relevant” are precisely the ones that seem to be involved in each experimental group on an “exclusive” basis.
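The idea of a relationship that holds in only one group can be sketched in a few lines of Python. The criteria AutoDiscovery uses to rank relevance are not described here, so this example assumes a deliberately simple rule: a relationship counts as exclusive when the absolute Pearson correlation reaches a threshold in exactly one group. The threshold of 0.8, the variable names (`dose`, `response`), the group names and the data are all hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def groups_with_strong_relationship(data, x, y, threshold=0.8):
    """Return the groups in which |r| between x and y reaches the (assumed) threshold."""
    return [
        group
        for group, columns in data.items()
        if abs(pearson_r(columns[x], columns[y])) >= threshold
    ]


# Hypothetical toy data: the dose-response relationship appears only in one group
data = {
    "control": {"dose": [1, 2, 3, 4, 5], "response": [5.1, 4.8, 5.3, 4.9, 5.0]},
    "treated": {"dose": [1, 2, 3, 4, 5], "response": [2.0, 4.1, 5.9, 8.2, 9.8]},
}

strong_in = groups_with_strong_relationship(data, "dose", "response")
if len(strong_in) == 1:
    print(f"dose ~ response looks exclusive to the '{strong_in[0]}' group")
```

A correlation is only a candidate for a cause-effect relationship, which is why the relationships are described as potential ones and still need expert review.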

List of relationships grouped by relevance

The "Filter" column in this table shows the experimental group in which each relationship may be exclusively involved. I recommend exporting the table to Excel so that you can easily extract the list of exclusive relationships in each group.
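Once the table is exported, splitting it by group is easy to script. The sketch below assumes the Excel sheet has been saved as CSV and that it contains columns named `Variable 1`, `Variable 2`, `Relevance` and `Filter`. Only the "Filter" column and the "Most relevant" level come from the description above; the other column names, the `Other` placeholder level and the rows themselves are hypothetical and should be adapted to the real export.

```python
import csv
import io
from collections import defaultdict

# Hypothetical export content; in practice this would be open("export.csv", newline="")
exported = io.StringIO(
    "Variable 1,Variable 2,Relevance,Filter\n"
    "dose,response,Most relevant,treated\n"
    "dose,weight,Most relevant,treated\n"
    "age,heart_rate,Most relevant,control\n"
    "age,weight,Other,\n"  # placeholder level, not tied to a single group
)

# Collect the "Most relevant" relationships under the group named in "Filter"
exclusive_by_group = defaultdict(list)
for row in csv.DictReader(exported):
    if row["Relevance"] == "Most relevant" and row["Filter"]:
        exclusive_by_group[row["Filter"]].append((row["Variable 1"], row["Variable 2"]))

for group, relationships in exclusive_by_group.items():
    print(group)
    for var_a, var_b in relationships:
        print(f"  {var_a} ~ {var_b}")
```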

If a more detailed exploration is required (for example, of the relationships in which certain variables are involved), we can use the relevance grouping together with HypoBooster.

Do you wish to analyze the differences between your experimental groups?

As you can see, performing this analysis with AutoDiscovery is simple and fast. You just need to download the software, install it, and use it with your own data.

Alternatively, should you wish to have us handle your data and hand you a complete report, we will be happy to help!