
How schema and data type mapping can be done in Azure Data Factory while copying data from source to sink

Schema and data type mapping in copy activity

Default mapping

By default, the copy activity maps source data to the sink by column name in a case-sensitive manner. If the sink doesn't exist yet (for example, when writing to a file or files), the source field names are persisted as the sink column names. If the sink already exists, it must contain all the columns being copied from the source. This default mapping supports flexible schemas and schema drift from source to sink across executions: all the data returned by the source data store can be copied to the sink.

If your source is a text file without a header line, explicit mapping is required because the source doesn't contain column names.
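As a sketch of that case, a headerless delimited text source can be mapped by ordinal position using a `TabularTranslator` in the copy activity's JSON definition (a minimal config fragment; the source/sink types and the column names `CustomerId` and `CustomerName` are hypothetical):

```json
{
    "type": "Copy",
    "typeProperties": {
        "source": { "type": "DelimitedTextSource" },
        "sink": { "type": "SqlSink" },
        "translator": {
            "type": "TabularTranslator",
            "mappings": [
                { "source": { "ordinal": 1 }, "sink": { "name": "CustomerId" } },
                { "source": { "ordinal": 2 }, "sink": { "name": "CustomerName" } }
            ]
        }
    }
}
```

Here `ordinal` refers to the 1-based column position in the file, which stands in for the missing header names.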

Explicit mapping

You can also specify an explicit mapping to customize the column/field mapping from source to sink as needed. With explicit mapping, you can copy only part of the source data to the sink, map source columns to sink columns with different names, or reshape tabular/hierarchical data. The copy activity:

  1. Reads the data from the source and determines the source schema.
  2. Applies your defined mapping.
  3. Writes the data to the sink.
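The steps above can be sketched as a name-based explicit mapping, again via a `TabularTranslator` fragment (an illustrative sketch; the column names and types here are hypothetical, and any source column not listed is simply not copied):

```json
{
    "translator": {
        "type": "TabularTranslator",
        "mappings": [
            { "source": { "name": "OrderId" },   "sink": { "name": "order_id" } },
            { "source": { "name": "OrderDate" }, "sink": { "name": "order_date" } }
        ]
    }
}
```

Each entry pairs one source column with one sink column, so renaming (`OrderId` to `order_id`) and partial copies both fall out of the same mechanism.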

You can configure the mapping in the Data Factory authoring UI -> copy activity -> Mapping tab, or programmatically specify the…


Written by Anirban Das, Cloud, Data & AI Innovation Architect

Global Lead - Cloud,Data & AI Innovation,Leads AI innovation, focused on building and implementing breakthrough AI research and accelerating AI adoption in org.
