<?xml version="1.0" encoding="UTF-8"?>
<item xmlns="http://omeka.org/schemas/omeka-xml/v5" itemId="19424" public="1" featured="0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://omeka.org/schemas/omeka-xml/v5 http://omeka.org/schemas/omeka-xml/v5/omeka-xml-5-0.xsd" uri="https://archives.christuniversity.in/items/show/19424?output=omeka-xml" accessDate="2026-04-08T10:19:06+00:00">
  <collection collectionId="16">
    <elementSetContainer>
      <elementSet elementSetId="1">
        <name>Dublin Core</name>
        <description>The Dublin Core metadata element set is common to all Omeka records, including items, files, and collections. For more information, see http://dublincore.org/documents/dces/.</description>
        <elementContainer>
          <element elementId="50">
            <name>Title</name>
            <description>A name given to the resource</description>
            <elementTextContainer>
              <elementText elementTextId="51377">
                <text>Conference Papers</text>
              </elementText>
            </elementTextContainer>
          </element>
        </elementContainer>
      </elementSet>
    </elementSetContainer>
  </collection>
  <itemType itemTypeId="28">
    <name>Conference Paper</name>
    <description>Faculty Publications- Conference Papers</description>
  </itemType>
  <elementSetContainer>
    <elementSet elementSetId="1">
      <name>Dublin Core</name>
      <description>The Dublin Core metadata element set is common to all Omeka records, including items, files, and collections. For more information, see http://dublincore.org/documents/dces/.</description>
      <elementContainer>
        <element elementId="50">
          <name>Title</name>
          <description>A name given to the resource</description>
          <elementTextContainer>
            <elementText elementTextId="167560">
              <text>File Validation in the Data Ingestion Process Using Apache NiFi</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="49">
          <name>Subject</name>
          <description>The topic of the resource</description>
          <elementTextContainer>
            <elementText elementTextId="167561">
              <text>Apache NiFi; Custom processor; Data ingestion; File validation; Frequency validation</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="41">
          <name>Description</name>
          <description>An account of the resource</description>
          <elementTextContainer>
            <elementText elementTextId="167562">
              <text>In today's industries, the development and maintenance of data pipelines is of paramount importance. With large volumes of data being generated continuously across industries, there is a growing need to process and store ingested data quickly and efficiently. Apache NiFi is one tool whose capabilities can be used to enhance, modify, and automate data pipelines. However, automating the ingestion process introduces inherent issues which, if left unresolved, can be detrimental to the entire ingestion process. These issues vary in nature, ranging from corrupted data to changes in the file schema, to name a few. In this paper, a solution to this problem is proposed. By exploiting Apache NiFi's custom processor development capabilities, problem-specific processors can be designed and deployed to validate the ingestion process accurately and in real time. To demonstrate this, two processors were developed as a proof of concept; they tackle two specific file-related validation issues in the ingestion process: file size and ingestion frequency. These custom-built processors are designed to be inserted into the pipeline at key points to ensure that the ingested data is validated against certain standards and requirements. Having demonstrated these capabilities, the paper presents Apache NiFi's custom processor capabilities as a potential way forward for resolving the many ingestion issues found in industry today. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="39">
          <name>Creator</name>
          <description>An entity primarily responsible for making the resource</description>
          <elementTextContainer>
            <elementText elementTextId="167563">
              <text>Irfan M.; Gangadhar A.; George J.</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="48">
          <name>Source</name>
          <description>A related resource from which the described resource is derived</description>
          <elementTextContainer>
            <elementText elementTextId="167564">
              <text>Lecture Notes in Networks and Systems, Vol-922 LNNS, pp. 299-310.</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="45">
          <name>Publisher</name>
          <description>An entity responsible for making the resource available</description>
          <elementTextContainer>
            <elementText elementTextId="167565">
              <text>Springer Science and Business Media Deutschland GmbH</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="40">
          <name>Date</name>
          <description>A point or period of time associated with an event in the lifecycle of the resource</description>
          <elementTextContainer>
            <elementText elementTextId="167566">
              <text>2024-01-01</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="43">
          <name>Identifier</name>
          <description>An unambiguous reference to the resource within a given context</description>
          <elementTextContainer>
            <elementText elementTextId="167567">
              <text>&lt;a href="https://doi.org/10.1007/978-981-97-0975-5_27" target="_blank" rel="noreferrer noopener"&gt;https://doi.org/10.1007/978-981-97-0975-5_27&lt;/a&gt;
&lt;br /&gt;&lt;br /&gt;&lt;a href="https://www.scopus.com/inward/record.uri?eid=2-s2.0-85197244973&amp;amp;doi=10.1007%2F978-981-97-0975-5_27&amp;amp;partnerID=40&amp;amp;md5=03979cda62fcb4fad86937037c4ddff7" target="_blank" rel="noreferrer noopener"&gt;https://www.scopus.com/inward/record.uri?eid=2-s2.0-85197244973&amp;amp;doi=10.1007%2f978-981-97-0975-5_27&amp;amp;partnerID=40&amp;amp;md5=03979cda62fcb4fad86937037c4ddff7&lt;/a&gt;</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="47">
          <name>Rights</name>
          <description>Information about rights held in and over the resource</description>
          <elementTextContainer>
            <elementText elementTextId="167568">
              <text>Restricted Access</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="46">
          <name>Relation</name>
          <description>A related resource</description>
          <elementTextContainer>
            <elementText elementTextId="167569">
              <text>ISSN: 2367-3370; ISBN: 978-981-97-0974-8</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="42">
          <name>Format</name>
          <description>The file format, physical medium, or dimensions of the resource</description>
          <elementTextContainer>
            <elementText elementTextId="167570">
              <text>Online</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="44">
          <name>Language</name>
          <description>A language of the resource</description>
          <elementTextContainer>
            <elementText elementTextId="167571">
              <text>English</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="51">
          <name>Type</name>
          <description>The nature or genre of the resource</description>
          <elementTextContainer>
            <elementText elementTextId="167572">
              <text>Conference paper</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="38">
          <name>Coverage</name>
          <description>The spatial or temporal topic of the resource, the spatial applicability of the resource, or the jurisdiction under which the resource is relevant</description>
          <elementTextContainer>
            <elementText elementTextId="167573">
              <text>Irfan M., CHRIST (Deemed to be University), Bangalore, 560029, India; Gangadhar A., Binghamton University, State University of New York, Binghamton, 13902, NY, United States; George J., CHRIST (Deemed to be University), Bangalore, 560029, India</text>
            </elementText>
          </elementTextContainer>
        </element>
      </elementContainer>
    </elementSet>
  </elementSetContainer>
</item>
