    Zepp News

    Onehouse raises $25M to expand its Apache Hudi tech, bringing order to data lakehouses

    By shivachetanbijjal@gmail.com, February 2, 2023

    Managed data lake provider Onehouse announced today that it has raised $25 million in Series A funding to advance its go-to-market and technical efforts, which are built on the open source Apache Hudi project.

    A year ago, in February 2022, Onehouse emerged as the first commercial vendor to provide support and services for Apache Hudi. Hudi, an acronym for Hadoop Upserts Deletes and Incrementals, traces its roots back to Uber in 2016, where it was originally developed to help organize the massive amounts of data stored in data lakes.

    Hudi technology provides a data lake table format and services that facilitate clustering, archiving, and data replication. Hudi competes with several other open source data lake table technologies, including Apache Iceberg and Databricks Delta Lake.

    Onehouse’s goal is to provide a cloud service that gives organizations the benefits of a hosted data lakehouse. Along with the new funding, Onehouse also announced its Onetable initiative, which aims to enable users of Iceberg and Delta Lake to interoperate with Hudi. With Onetable, organizations can use Hudi to ingest data into a data lake while still benefiting from query engines that work with Iceberg (including Snowflake) and with Databricks’ Delta Lake.

    “We’re really trying to build a new way of thinking about data architecture,” Onehouse founder and CEO Vinoth Chandar told VentureBeat. “We strongly believe that people should start with an interoperable lakehouse.”

    Understanding Data Lakehouse Trends

    Data lakehouse is a term originally coined by Databricks.

    The goal of a data lakehouse is to combine the best aspects of data lakes, which provide massive data storage, and data warehouses, which provide structured data services for query and data analysis. A 2022 report from Databricks identified a number of key benefits of a data lakehouse approach, including improved data quality, increased productivity, and better data collaboration.

    A key component of the data lakehouse model is the ability to apply structure to the data lake, which is where open source data lake table formats including Hudi, Delta Lake, and Iceberg come in. Multiple vendors are now building complete platforms using these table formats as a foundation.

    Among the many supporters of Apache Iceberg, Cloudera launched its data lakehouse service in August 2022. Dremio is another strong Iceberg supporter, using it as part of its data lakehouse platform. Even Snowflake, one of the pioneers of the cloud data warehouse concept, now supports Iceberg.

    Onetable is not another data lake table format

    At their core, today’s major data lake table formats, including Hudi, Delta Lake, and Iceberg, are ways of organizing files that organizations expect to be able to use for analytics, business intelligence, or operations.

    One challenge that has emerged, however, is that vendor technologies are increasingly vertically integrated, combining data storage and query engines. Kyle Weller, director of product at Onehouse, explained that he sees organizations struggling to choose a vendor because each one supports a different data lake table format. The Onetable approach aims to abstract away the differences between data lake table formats to create an interoperability layer.

    “Onehouse’s goal and mission is to decouple the data processing data query engine from how the core data infrastructure operates,” Weller told VentureBeat.

    Weller added that the foundation of many data lakes today is a set of files stored in the Apache Parquet data storage format. Onetable essentially provides a metadata layer on top of Parquet that makes it easy to convert from one table format to another.
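
    The metadata-layer idea can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not Onetable’s actual implementation: "format A" and "format B" are hypothetical stand-ins that do not reflect the real Hudi, Iceberg, or Delta Lake metadata layouts, and every class and function name is invented for illustration. The point it shows is that converting between table formats only rewrites metadata, while the underlying Parquet data files stay untouched.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class DataFile:
    """A Parquet data file, described only by what table metadata tracks."""
    path: str
    record_count: int


@dataclass
class TableSnapshot:
    """Format-neutral view of a table: the data files in its current version."""
    name: str
    files: List[DataFile] = field(default_factory=list)


def to_format_a(snapshot: TableSnapshot) -> dict:
    """Render the snapshot as hypothetical 'format A' metadata (flat manifest)."""
    return {
        "table": snapshot.name,
        "manifest": [
            {"file": f.path, "rows": f.record_count} for f in snapshot.files
        ],
    }


def from_format_a(metadata: dict) -> TableSnapshot:
    """Read hypothetical 'format A' metadata back into the neutral view."""
    files = [DataFile(e["file"], e["rows"]) for e in metadata["manifest"]]
    return TableSnapshot(metadata["table"], files)


def to_format_b(snapshot: TableSnapshot) -> dict:
    """Render the snapshot as hypothetical 'format B' metadata (commit log)."""
    return {
        "table_name": snapshot.name,
        "commits": [
            {"action": "add", "path": f.path, "num_records": f.record_count}
            for f in snapshot.files
        ],
    }


def convert_a_to_b(metadata_a: dict) -> dict:
    """Translate metadata only; no Parquet file is read, copied, or rewritten."""
    return to_format_b(from_format_a(metadata_a))


if __name__ == "__main__":
    snapshot = TableSnapshot(
        "trips",
        [
            DataFile("s3://example-bucket/trips/part-0.parquet", 100),
            DataFile("s3://example-bucket/trips/part-1.parquet", 250),
        ],
    )
    metadata_b = convert_a_to_b(to_format_a(snapshot))
    for commit in metadata_b["commits"]:
        print(commit["path"], commit["num_records"])
```

    Routing every conversion through one neutral snapshot means each additional format needs only a reader and a writer, rather than a dedicated converter for every pair of formats.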

    Where Onetable fits in the data lakehouse use case

    Chandar noted that Hudi offers advantages over other formats, such as transactional replication and fast data ingestion.

    One potential use case where he sees Onetable’s capabilities as a good fit is for organizations that use Hudi for high-volume data ingestion, but want to be able to use the data with other query engines or technologies, such as a Snowflake data cloud deployment, for some type of analysis.

    Many companies whose data is stored in data warehouses are increasingly deciding to build a data lake, either because of cost considerations or because they want to start a new data science team, Chandar said. The first thing these organizations do is data ingestion, bringing all their transactional data into the lake, which is where Chandar said the Hudi and Onehouse services excel.

    Now, taking advantage of Onetable technology, the same organization that brings data into Onehouse can also query and analyze the data using other technologies such as Snowflake and Databricks.

    Looking ahead for the Hudi and Onehouse platforms, Chandar emphasized that further improving an organization’s ability to put data to use quickly will remain a key theme.

    “We’ve announced in the Hudi project that we hope to add a caching layer at some point,” he said. “We’re thinking about anything around the data and how we can really optimize it.”
