.. index::
   single: ebas_extract; program

.. program:: ebas_extract

.. _EBASprogram_ebas_extract:

ebas_extract
============

The program *ebas_extract* extracts data from the EBAS database.
The following file formats are currently supported:

   * EBAS NASA Ames
   * a simple CSV format (mostly for testing)
   * XML (used for machine-to-machine communication with the ACTRIS web service)
   * NetCDF (see :option:`--format`)

Synopsis
--------

::

   ebas_extract.py [-h] [--version] [--cfgfile CFGFILE]
                   [--loglevconsole LOGLEVCONSOLE]
                   [--loglevfile LOGLEVFILE] [--logfile LOGFILE]
                   [--profile] [--nodb] [--dbHost DBHOST] [--db DB]
                   [--dbUser DBUSER] [--dbPasswd DBPASSWD] [--do_id DO_ID]
                   [--setkey SETKEY] [--station STATION]
                   [--project PROJECT] [--instrument INSTRUMENT]
                   [--component COMPONENT] [--matrix MATRIX]
                   [--group GROUP] [--fi_ref FI_REF] [--me_ref ME_REF]
                   [--resolution RESOLUTION] [--statistics STATISTICS]
                   [--time TIME] [--state STATE] [--us_id US_ID]
                   [--format FORMAT] [--multicolumn] [--expand]
                   [--precip_amount] [--xmlwrap | --createfiles]
                   [--destdir DESTDIR]
                   [--flags {all,compress,none,one-or-all}]
                   [--metadata_options METADATA_OPTIONS]
                   [--fileindex FILEINDEX]


Commandline arguments
---------------------

.. include:: ./include/commandline_arguments/intro.include

General arguments
^^^^^^^^^^^^^^^^^

.. include:: ./include/commandline_arguments/general.include


Configuration arguments
^^^^^^^^^^^^^^^^^^^^^^^

.. include:: include/commandline_arguments/config.include


Logging arguments
^^^^^^^^^^^^^^^^^

.. include:: include/commandline_arguments/logging.include


Database arguments
^^^^^^^^^^^^^^^^^^

.. include:: include/commandline_arguments/db.include


Commandline arguments for dataset selection criteria
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Dataset selection criteria are used to define a set of datasets to be exported
by *ebas_extract*.

.. include:: include/commandline_arguments/ds_criteria.include


Time interval criteria
^^^^^^^^^^^^^^^^^^^^^^

.. include:: include/commandline_arguments/time.include


History arguments
^^^^^^^^^^^^^^^^^

.. include:: include/commandline_arguments/state.include


On behalf of...
^^^^^^^^^^^^^^^

.. include:: include/commandline_arguments/us_id.include



Arguments specific to *ebas_extract*
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. option:: --format=FORMAT

   file format for output:

      * ``NasaAmes``
      * ``CSV``
      * ``XML``
      * ``NetCDF``

   (default: NasaAmes)

   .. versionadded:: 3.00.08 NetCDF

.. option:: --multicolumn

   multicolumn output, multiple variables per file (default: False)

.. option:: --expand

  expand multicolumn output, add associated datasets (default: False)

.. option:: --precip_amount, --amount

   add associated precipitation amount datasets for each precipitation
   concentration dataset (default: False)

.. option:: --xmlwrap

   wrap output in xml containers (default: False)

.. option:: --createfiles         

   create files instead of output to stdout (default: False)

.. option:: --destdir=DESTDIR

   set output directory for files (only allowed after :option:`--createfiles`)
   (default: None)

.. option:: --flags=FLAG_OUTPUT_STYLE

   flag columns style:
   
      * ``one-or-all`` (default):
         If all variables share the same sequence of flags
         throughout the whole file, use one flag column as last
         column.
         Else, one flag column per variable is used.
         This is the default behavior starting from EBAS 3.0.
      
      * ``compress``:
         If multiple variables share the same sequence of flags
         throughout the whole file, one flag column after this
         group of variables is used.
         This produces files as narrow as possible without
         losing any flag information.
         This used to be the default behavior up to EBAS 2.2.
      
      * ``all``:
         All variables get a dedicated flag column.

      * ``none``:
         No flag columns are exported. Invalid and missing data
         are both reported as the MISSING value. This should be
         used very carefully, as information is LOST on export!
         Intended for non-expert users, as the easiest approach
         to process only valid data without having to deal with
         the EBAS flag system.
         Note: Detection limit values (flag 781) are exported
         as value/2.0 (only in this case, when no flag
         information is extracted).

   As a general rule, a flag column ALWAYS applies to all
   preceding variables after the previous flag column.

   (default: one-or-all)
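
   The grouping rules for the flag column styles can be illustrated with a
   short Python sketch. It is not the EBAS implementation: the helper name
   ``flag_column_groups`` and its input layout (one list of per-row flag
   lists for each variable) are assumptions made for illustration. For
   ``compress`` the sketch additionally assumes that only adjacent variables
   are merged, so that each flag column still refers to the variables
   directly preceding it.

   .. code-block:: python

      from itertools import groupby


      def flag_column_groups(flag_sequences, style="one-or-all"):
          """Return lists of variable indices; one flag column follows each list.

          flag_sequences holds, per variable, the flags of every row in the file.
          Hypothetical helper for illustration, not part of EBAS.
          """
          count = len(flag_sequences)
          # Make the sequences comparable regardless of list/tuple input.
          keys = [tuple(map(tuple, seq)) for seq in flag_sequences]
          if style == "none":
              return []  # no flag columns at all
          if style == "all":
              return [[i] for i in range(count)]
          if style == "one-or-all":
              if count and all(key == keys[0] for key in keys):
                  return [list(range(count))]  # one shared column, placed last
              return [[i] for i in range(count)]
          if style == "compress":
              # Assumption: only runs of adjacent variables are merged.
              return [list(run)
                      for _, run in groupby(range(count), key=keys.__getitem__)]
          raise ValueError("unknown flag style: %r" % style)


      # Three variables with three rows each; 781 is the detection limit flag.
      flags = [
          [[], [781], []],
          [[], [781], []],
          [[], [], []],
      ]
      print(flag_column_groups(flags, "one-or-all"))  # [[0], [1], [2]]
      print(flag_column_groups(flags, "compress"))    # [[0, 1], [2]]

   In this example the first two variables share their flags, so ``compress``
   needs two flag columns, while ``one-or-all`` falls back to one column per
   variable because the third variable differs.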

.. option:: --metadata_options=METADATA_OPTIONS

   metadata options for output (currently only one option is available):

      * ``setkey``
         Include the dataset key in the metadata output (not included by default)

   (default: None)

   .. versionadded:: 3.00.08

.. option:: --fileindex=FILEINDEX

   file path and name for ebas file index database
   (sqlite3). Prepend a plus sign (+) to the filename in
   order to add to an existing database instead of
   creating a new one. (default: None)
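
   The plus sign convention and a first look at a generated index file can be
   sketched in Python as follows. Both helpers are hypothetical and only
   illustrate the description above; the file name ``index.sqlite3`` is an
   assumption. Because the table layout of the index database is not
   described here, the sketch only lists the table names it finds.

   .. code-block:: python

      import sqlite3


      def split_fileindex_arg(value):
          """Split a --fileindex value into (path, append) using the '+' rule."""
          if value.startswith("+"):
              return value[1:], True
          return value, False


      def list_tables(path):
          """Return the names of all tables in the SQLite database at path."""
          # Note: sqlite3.connect() creates an empty file if path is missing.
          con = sqlite3.connect(path)
          try:
              rows = con.execute(
                  "SELECT name FROM sqlite_master"
                  " WHERE type = 'table' ORDER BY name"
              ).fetchall()
          finally:
              con.close()
          return [row[0] for row in rows]


      print(split_fileindex_arg("+index.sqlite3"))  # ('index.sqlite3', True)

   Calling ``list_tables("index.sqlite3")`` after an extract is a simple way
   to explore the index before writing queries against it.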

.. option:: --diffxml=DIFFXML

   file path and name for ebas diff xml output.
   Only useful in conjunction with the :option:`--diff` argument.
   Creates an output xml file with deleted and added data
   intervals for each dataset. This xml file can be used
   to synchronize a local repository at the recipient
   side if diff extracts are used. (default: None)

   See :ref:`EBAS_differential_extract` for more information on differential
   extracts.
   Please refer to the
   :ref:`FileFormatSpecification_EBAS_diffxml` for details on this file format.

   .. versionadded:: 3.00.08

.. index::
   pair: ebas_extract; Differential extracts

.. _EBAS_differential_extract:

Differential extracts
---------------------

.. versionadded:: 3.00.08

A special use of the :ref:`historic states <EbasHistory>` of the EBAS
database is the generation of differential data extracts.

Using the arguments :option:`--diff` and :option:`--diffxml` (optionally
combined with :option:`--state`), ``ebas_extract`` can produce differential
extracts.

This is most useful for "updating" a data user about changes in the database
since the last extract they received.

 * The :option:`--diff` argument ensures that only data which exist in the
   database and have been changed since a specific date are extracted.
 * The :option:`--diffxml` argument generates an XML file with all changes
   relative to the old database state (including information on data
   intervals that have been deleted). Please refer to the
   :ref:`FileFormatSpecification_EBAS_diffxml` for details on this file format.

Example:

   ``ebas_extract --...  --state 2015-12-01 --diff 2015-11-01 --diffxml diffexport.xml``

   Produces an extract with various filter conditions (``--...``) at the database
   state of 1 December 2015, midnight, but extracts only data changed between
   1 November and 1 December 2015. In addition to the data files, there will be
   a file named ``diffexport.xml`` containing additional information about the
   changes.
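
When differential extracts are produced regularly, the command line from the
example can be assembled by a script. The following Python sketch shows one
possible way; the helper ``build_diff_extract_cmd`` and the station code
``XX0000X`` are hypothetical and stand in for the real selection criteria
(``--...``). The program name is passed as a parameter because the installed
name may differ (for example ``ebas_extract.py``).

.. code-block:: python

   import shlex


   def build_diff_extract_cmd(state, diff, diffxml, criteria=(),
                              program="ebas_extract"):
       """Return the argument list for a differential extract."""
       return [program, *criteria,
               "--state", state, "--diff", diff, "--diffxml", diffxml]


   cmd = build_diff_extract_cmd(
       state="2015-12-01",
       diff="2015-11-01",
       diffxml="diffexport.xml",
       criteria=["--station", "XX0000X"],  # hypothetical station code
   )
   print(shlex.join(cmd))
   # To run the extract: subprocess.run(cmd, check=True)

Passing the arguments as a list (instead of one shell string) avoids quoting
problems when values contain spaces.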

Forward and backward differential extracts
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Although the usual case is a *forward* differential extract (the timestamp
specified by :option:`--diff` is earlier in time than the selected database
state), *backward* differential extracts are technically possible.

A *forward* differential extract generates the recipe for a data user to
*update* a data extract from an old state to a newer state. This is what data
users usually need.