Squeezing the Knowledge Out of Your Complex Dataset in 3 Steps
Wednesday, October 29, 2014
The term “complexity”, in general usage, characterizes something with many parts in an intricate arrangement. In research, examples of extremely complex systems include the human brain, the biochemical basis of cancer, and Asian tropical storms.
Do you want to learn how to squeeze the knowledge out of your complex datasets in only 3 steps with AutoDiscovery?
Reverse engineering and researching
Most researchers agree that the complexity of data, and of the relationships within it, is the biggest challenge at all scales and in most applications. Consequently, any dataset (whether large or small) with hundreds or thousands (or more) of dimensions per data item is difficult to explore, mine, and interpret.
One of the most efficient methods for better understanding how a complex system functions is so-called “reverse engineering”, that is, the process of extracting knowledge or design information from something and analyzing its components and workings in detail.
In many cases, the goal of this process is to generate technical documentation that explains, in a simple and comprehensive way, how the analyzed system works.
A typical example of reverse engineering takes place when an engineer picks up a mechanical component and, starting from there, produces its technical drawing in order to better “understand” it, improve it, and ultimately get the component into mass production.
In a way, many scientific research methodologies (such as observational or experimental studies) have a strong reverse engineering component when they try, for example, to draw inferences about the possible effect of a treatment on subjects. However, in most cases the system under study can only be decomposed by means of the data gathered during its operation.
AutoDiscovery is an excellent tool for generating that sort of documentation from data and, of course, for exploiting it afterwards if that was the goal in the first place.
Let the data tell the story
To describe a complex system, AutoDiscovery starts with the acquisition of project-related data and proposes a sequence of 3 steps:
For example, if we studied the process by which new neurons are generated in our brain (neurogenesis), we could aggregate the information from visual inspection under a microscope, the information related to the learning/memory subsystem, and the information related to the neurobiological anxiety subsystem.
Results are automatically sorted by their scientific relevance so that you can focus on the most important part of the story first.
This process may be configured so that the description is restricted to a specific section of the system, in order to avoid producing documentation that is too complex to handle.
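As a rough illustration of the idea of aggregating variables from several sources, ranking the relationships found among them, and optionally restricting the analysis to one part of the system, here is a minimal Python sketch. It is not AutoDiscovery's algorithm, which this article does not describe: the Pearson correlation coefficient stands in for “scientific relevance”, and the variable names, the sample values, and the functions pearson and rank_relationships are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant variable shows no linear relationship
    return cov / sqrt(var_x * var_y)


def rank_relationships(variables, focus=None):
    """Return (name_a, name_b, r) tuples sorted by |r|, strongest first.

    variables: dict mapping a variable name to its list of measurements.
    focus: optional set of names; only pairs involving one of them are kept.
    """
    ranked = []
    for a, b in combinations(sorted(variables), 2):
        if focus is not None and a not in focus and b not in focus:
            continue
        ranked.append((a, b, pearson(variables[a], variables[b])))
    # Assumption: correlation strength is used as the relevance score.
    ranked.sort(key=lambda item: abs(item[2]), reverse=True)
    return ranked


# Hypothetical measurements aggregated from three data sources.
data = {
    "microscopy_new_cells": [10, 12, 15, 18, 21],
    "memory_score": [20, 24, 30, 36, 42],
    "anxiety_index": [5, 3, 6, 2, 4],
}

ranking = rank_relationships(data)
focused = rank_relationships(data, focus={"anxiety_index"})
```

Here ranking lists every pair of variables with the strongest linear relationship first, while focused keeps only the pairs that involve the hypothetical anxiety_index variable, mirroring the idea of framing the description to one section of the system.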
AutoDiscovery provides two very efficient tools meant to facilitate the processes of comprehending, describing, and documenting the system.
The Discovery Map is an interactive graph that facilitates navigating and exploring the list of relationships detected between data sources and variables.
The Hypo Booster tool complements the interactive Discovery Map by refining the exploration of the relationships most relevant to you.
Thanks to these two tools and the possibility of exporting all the details in well-known file formats (images and tables), it is very easy to understand and explain which variables intervene in the behavior of the system, what kinds of relationships exist among them, and in which particular circumstances those relationships occur.