{
  "cells": [
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "%matplotlib inline"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "\n# Dataset from a cache\n\nIn this example, we will see how to build a :class:`.Dataset` from objects\nof an :class:`.AbstractFullCache`.\nFor that, we need to import this :class:`.Dataset` class:\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "from __future__ import annotations\n\nfrom gemseo.api import configure_logger\nfrom gemseo.caches.memory_full_cache import MemoryFullCache\nfrom numpy import array\n\nconfigure_logger()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Synthetic data\nLet us consider a :class:`.MemoryFullCache` storing two parameters:\n\n- x with dimension 1 which is a cache input,\n- y with dimension 2 which is a cache output.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "cache = MemoryFullCache()\ncache[{\"x\": array([1.0])}] = ({\"y\": array([2.0, 3.0])}, None)\ncache[{\"x\": array([4.0])}] = ({\"y\": array([5.0, 6.0])}, None)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Create a dataset\nWe can easily build a dataset from this :class:`.MemoryFullCache`,\neither by separating the inputs from the outputs (default option):\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "dataset = cache.export_to_dataset(\"toy_cache\")\nprint(dataset)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "or by considering all features as default parameters:\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "dataset = cache.export_to_dataset(\"toy_cache\", categorize=False)\nprint(dataset)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Access properties\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "dataset = cache.export_to_dataset(\"toy_cache\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Variables names\nWe can access the variables names:\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "print(dataset.variables)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Variables sizes\nWe can access the variables sizes:\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "print(dataset.sizes)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Variables groups\nWe can access the variables groups:\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "print(dataset.groups)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Access data\nAccess by group\n~~~~~~~~~~~~~~~\nWe can get the data by group, either as an array (default option):\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "print(dataset.get_data_by_group(\"inputs\"))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "or as a dictionary indexed by the variables names:\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "print(dataset.get_data_by_group(\"inputs\", True))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Access by variable name\nWe can get the data by variables names,\neither as a dictionary indexed by the variables names (default option):\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "print(dataset.get_data_by_names([\"x\"]))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "or as an array:\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "print(dataset.get_data_by_names([\"x\", \"y\"], False))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Access all data\nWe can get all the data, either as a large array:\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "print(dataset.get_all_data())"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "or as a dictionary indexed by variables names:\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "print(dataset.get_all_data(as_dict=True))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "We can get these data sorted by category, either with a large array for each\ncategory:\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "print(dataset.get_all_data(by_group=False))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "or with a dictionary of variables names:\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "print(dataset.get_all_data(by_group=False, as_dict=True))"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.9.13"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}