[{"content":" # DBT Studio - Configure Connection ","date":"2025-09-24T00:00:00Z","permalink":"https://rosettadb.io/videos/dbt-studio---configure-connection/","title":"DBT Studio - Configure Connection"},{"content":" # DBT Studio - Features Overview ","date":"2025-09-24T00:00:00Z","permalink":"https://rosettadb.io/videos/dbt-studio---features-overview/","title":"DBT Studio - Features Overview"},{"content":" # DBT Studio - Configuring AI Providers ","date":"2025-09-23T00:00:00Z","permalink":"https://rosettadb.io/videos/dbt-studio---configuring-ai-providers/","title":"DBT Studio - Configuring AI Providers"},{"content":" # DBT Studio - Getting Started ","date":"2025-09-22T00:00:00Z","permalink":"https://rosettadb.io/videos/dbt-studio---getting-started/","title":"DBT Studio - Getting Started"},{"content":" # DBT Studio - (AI) Generate Analytics ","date":"2025-09-21T00:00:00Z","permalink":"https://rosettadb.io/videos/dbt-studio---ai-generate-analytics/","title":"DBT Studio - (AI) Generate Analytics"},{"content":" # DBT Studio - Create empty project and load data from Cloud ","date":"2025-09-20T00:00:00Z","permalink":"https://rosettadb.io/videos/dbt-studio---create-empty-project-and-load-data-from-cloud/","title":"DBT Studio - Create empty project and load data from Cloud"},{"content":" # DBT Studio - Jaffle Shop Import ","date":"2025-09-19T00:00:00Z","permalink":"https://rosettadb.io/videos/dbt-studio---jaffle-shop-import/","title":"DBT Studio - Jaffle Shop Import"},{"content":" # DBT Studio - Import Project from ZIP and show DBT Docs ","date":"2025-09-18T00:00:00Z","permalink":"https://rosettadb.io/videos/dbt-studio---import-project-from-zip-and--show-dbt-docs/","title":"DBT Studio - Import Project from ZIP and  show DBT Docs"},{"content":" # DBT Studio - The Secret Behind DBT Studio ","date":"2025-09-17T00:00:00Z","permalink":"https://rosettadb.io/videos/dbt-studio---the-secret-behind-dbt-studio/","title":"DBT Studio - The Secret Behind DBT Studio"},{"content":" # 
Accelerate Your Analytics Workflow with Rosetta DBT Studio In the rapidly evolving world of modern data engineering, DBT (Data Build Tool) has become an indispensable part of the analytics engineering workflow. It empowers data teams to transform raw data in their warehouse into clean, tested, and documented data products that are ready for analytics. But as powerful as dbt Core™ is, onboarding new users and scaling development efficiently across teams can still be challenging.\nThat’s where Rosetta DBT Studio comes in, a streamlined graphical desktop application designed to enhance the DBT experience, simplify project setup, and supercharge productivity using AI.\n# Why DBT Matters Today DBT is becoming the de facto standard for data transformation in modern data stacks. It brings software engineering best practices, version control, modularity, testing, and documentation, to analytics engineering. With DBT, data teams can:\nWrite modular SQL to transform data in the warehouse Add tests and documentation to ensure data quality Maintain version-controlled projects with Git Schedule and orchestrate models in production However, building and maintaining DBT (dbt™ Core) projects from scratch still requires significant setup time, especially when scaling across new teams or use cases.\n# Introducing Rosetta DBT Studio Rosetta DBT Studio bridges this gap by making dbt Core™ not only easier to use but also drastically faster to adopt. It’s a powerful desktop app (compatible with macOS, Windows, and Linux) that combines AI-driven automation, dbt™ Core integration, and developer tools to streamline your entire analytics engineering lifecycle.\nAt the heart of analytics success is the ability to convert messy raw data into business-ready insights. 
With Rosetta DBT Studio, Data Engineers can easily navigate and implement robust pipelines that follow a trusted data modelling pattern:\nRaw Layer → Business Model → Reports/Dashboards Raw Layer → Staging Layer → Business Model → Reports/Dashboards Raw Layer → Staging Layer → Enhanced Layer → Business Model → Reports/Dashboards Rosetta DBT Studio accelerates this process by offering:\nAI-guided DBT model generation: Configure a connection to your warehouse and let Rosetta analyse your source tables, suggesting staging models and transformations instantly. Layered structure templates: Pre-built templates for staging, enhanced, and business models allow you to scaffold your entire pipeline in minutes, not days. Intelligent suggestions: From data type checks to best-practice naming conventions, Rosetta helps you enforce consistency across your models. Incremental logic support: Automatically identify unique keys and timestamp columns to implement incremental strategies with minimal setup. Business model builder: Convert enhanced data into meaningful business entities like Orders, Revenue, or Customer Segments, ready for reporting and visualisation tools like Tableau, Power BI, or Looker. By abstracting complexity and automating boilerplate work, Rosetta DBT Studio empowers teams to focus on business logic, not plumbing. Whether you’re defining KPIs or enabling executive dashboards, Rosetta ensures your data foundation is solid, scalable, and production-ready.\n# What you can expect dbt™ Core Integration: Execute all your dbt commands like run, test, or build directly from the Studio. You still can use the command line if you need it. Git Integration: Connect to Git, manage branches, and version your transformations without leaving the interface. Multi-Database Support: Out-of-the-box support for Postgres, Snowflake, BigQuery, Databricks, DuckDB, and Redshift. Built-in SQL Explorer: Run queries against your data warehouse directly from the Studio interface. 
AI: Business model generation, basic transformations, incremental logic, analytics query generation. # End-to-End Use Case: From Raw Data to Dashboard in Minutes Let’s walk through how Rosetta DBT Studio can take a project from raw data to production-ready dashboards in a fraction of the time:\nDownload Rosetta DBT Studio to follow along.\n# 1. Configure a connection to your Warehouse (Raw Layer) Assume you have a few tables in your warehouse under the raw schema. You point Rosetta DBT Studio to this layer. There is also a Getting Started Project, so you can be up and running quickly.\n# 2. Generate the Staging Layer With one click, Rosetta scans the RAW tables and automatically generates staging models using DBT conventions. It even includes schema tests, such as not_null and unique, based on data profiling.\nAs a result, you will have new models created in your dbt project under the staging directory. For each table, you will find the .sql file and .yaml file containing the schema and dbt tests.\n# 3. Generate the Enhanced Layer Using AI, Rosetta builds enhanced (intermediate) models, including logic for incremental loading, column transformation, and business logic enrichment. It also recommends unique and incremental keys, streamlining SCD-type logic setup.\nHere you can use the AI Assistant to automatically determine the UNIQUE_KEY_COLUMNS and INCREMENTAL_COLUMN in your model. In order to use the AI capabilities you will need to configure your OpenAI API Key in the Settings page.\n# 4. Generate Business Models Now comes the magic. You define what kind of business model you need (e.g., “Customer Lifetime Value”, “Revenue by Region”), and Rosetta’s AI generates the full SQL model with logic.\nFor the prompt, let’s use: “Course enrolments and grading over semesters”.\nTo generate a business model you will need to configure the OpenAI API Key in the Settings.\n# 5. 
Generate Dashboard-Ready Analytics Queries Finally, Rosetta DBT Studio can output ready-to-use SQL queries tailored for your dashboard tool (e.g., Looker, Power BI, Tableau). These queries are built on top of your business models and are structured for fast consumption.\nGo to your business model file and use the AI Assistant to generate dashboard-ready analytics queries.\n# Time Saved = Value Delivered What traditionally took days or weeks of manual modelling and documentation now happens in minutes or hours. Rosetta DBT Studio allows your team to focus on strategy and analysis instead of boilerplate SQL and repetitive setup.\nRosetta DBT Studio is advancing the evolution in analytics engineering productivity. Whether you’re a solo data practitioner or part of a growing data team, it allows you to leverage the full power of dbt™ Core, faster, smarter, and with less friction.\nBest of all, you get all of this absolutely for FREE and it is fully open source!\n","date":"2025-07-15T00:00:00Z","image":"https://rosettadb.io/blogs/stop-using-dbt-core/DBT_Superhero_hu_a5117d94322334e4.png","permalink":"https://rosettadb.io/blogs/stop-using-dbt-core/","title":"Stop using dbt Core™!"},{"content":"RosettaDB is a powerful tool for managing database schemas, enabling transformation and management of database objects across different databases. Combined with dbt (Data Build Tool), it allows you to generate dbt models directly from your database schema, creating structured, ready-to-use datasets for analysis. This guide demonstrates how to generate dbt models using RosettaDB from your database schema.\n# Prerequisites Download JDBC drivers for your databases, install RosettaDB from the releases page, and refer to the Quick Start Guide for setup instructions.\n# Setting Up RosettaDB # 1. 
Initialize a New RosettaDB Project To create a new project, use:\nrosetta init dbt_postgres_project This will create a project directory with a main.conf file for defining your database connections.\n# 2. Configure Database Connections Edit the main.conf file to define your database connection. Here’s an example configuration for a general database connection:\nconnections: - name: postgres_conn databaseName: analysis_db dbType: postgres url: jdbc:postgresql://localhost:5432/analysis_db userName: username password: password # 3. Extract DBML Models Run the rosetta extract command to generate DBML models from your database schema.\nrosetta extract -s postgres_conn Now that you have the DBML models, you can proceed to generate dbt models.\n# Generating dbt Models Use the rosetta dbt command to convert your extracted DBML into dbt models:\nrosetta dbt -s postgres_conn This command will produce dbt models based on the extracted schema, ready to integrate into your dbt project.\n# Example dbt Model Output Here’s an example of what the generated dbt models might look like, covering multiple tables for better context.\n**model.yaml**\nversion: 2 sources: - name: Retail_Analysis description: \u0026#34;Data source for retail analysis\u0026#34; tables: - name: sales_transactions columns: - name: transaction_id tests: - not_null - unique - name: product_id tests: - not_null - name: customer_id tests: - not_null - name: transaction_date tests: - not_null - name: amount tests: [] - name: products columns: - name: product_id tests: - not_null - unique - name: product_name tests: [] - name: category tests: [] - name: price tests: [] - name: customers columns: - name: customer_id tests: - not_null - unique - name: first_name tests: [] - name: last_name tests: [] - name: email tests: - unique - name: registration_date tests: [] Example Models\n1. 
Sales Transactions Model\nwith sales_transactions as ( select transaction_id, product_id, customer_id, transaction_date, amount from {{ source(\u0026#39;Retail_Analysis\u0026#39;, \u0026#39;sales_transactions\u0026#39;) }} ) select * from sales_transactions 2. Products Model\nwith products as ( select product_id, product_name, category, price from {{ source(\u0026#39;Retail_Analysis\u0026#39;, \u0026#39;products\u0026#39;) }} ) select * from products 3. Customers Model\nwith customers as ( select customer_id, first_name, last_name, email, registration_date from {{ source(\u0026#39;Retail_Analysis\u0026#39;, \u0026#39;customers\u0026#39;) }} ) select * from customers Summary\nBy following these steps, you can effectively generate dbt models from your PostgreSQL schema using RosettaDB. With these models, you can run transformations, apply tests, and document your data workflow, enhancing data quality and usability. This process not only streamlines your data management but also aligns with best practices in data analytics.\nFor more details, check the official RosettaDB documentation or reach out to the community if you need further assistance.\n","date":"2024-10-30T00:00:00Z","image":"https://rosettadb.io/use-cases/generating-dbt-models-using-rosettadb/rosetta-dbt_hu_8c77298c013e915f.png","permalink":"https://rosettadb.io/use-cases/generating-dbt-models-using-rosettadb/","title":"Generating dbt Models using RosettaDB"},{"content":"As the world is becoming increasingly data driven, the need for powerful and intuitive tools to manage and interpret this data has never been more critical. This is where RosettaDB comes in, utilizing the advanced capabilities of OpenAI’s language models with the robustness of database management. Let’s explore how RosettaDB is setting a new standard for data exploration.\nRosettaDB emerged as an open source tool in the landscape of database management, especially for migrating data across different platforms. 
It acts as a DDL (Data Definition Language) transpiler, enabling the translation of database schemas from one database system to another with no manual intervention.\nArtificial Intelligence (AI), particularly in the form of Large Language Models (LLMs) like those developed by OpenAI, is becoming increasingly prolific. One of the most exciting applications of LLMs in the context of database management is their ability to generate SQL queries. This capability allows users, even those with minimal technical expertise, to interact with databases in a more intuitive way. By simply describing data needs in natural language, users can retrieve information without needing to know the SQL syntax.\nRosettaDB introduces a groundbreaking feature: the “rosetta query …”. This functionality enables users to write queries in natural language to explore their data and even export results into a CSV file. This means that instead of wrestling with complex query syntax, users can ask questions in plain English and receive answers directly, greatly simplifying data analysis tasks.\nRosettaDB doesn’t stop at queries. It also supports the generation of various DDL and DML scripts. This allows users to not only fetch data but also modify the database structure or manage the data itself through commands generated by the tool. To ensure security and integrity, RosettaDB restricts its operations to SELECT commands when executing queries generated from natural language. This prevents accidental data modifications or deletions, making it a safer choice for users who are experimenting with data queries or those in a learning phase.\nIn the following section we’ll learn how to use RosettaDB in practice to achieve the above mentioned capabilities.\nDownload and configure RosettaDB on your machine. Download all the required JDBC drivers. 
For more details on this step please refer to the Getting Started section of RosettaDB docs https://github.com/rosettadb/rosetta#getting-started Create a new rosetta project using the init command rosetta init [PROJECT_NAME] 3. Edit the main.conf file to configure the database connection settings. At the top of the file, include your OpenAI API key and, optionally, specify the model you wish to use (default is gpt-3.5-turbo).\nNote: You will need to register for an account with OpenAI if you don’t already have one.\nExample:\nopenai_api_key: \u0026#34;sk-abcdefghijklmno1234567890\u0026#34; openai_model: \u0026#34;gpt-4\u0026#34; connections: - name: pg databaseName: postgres schemaName: rosseta_testing dbType: postgres url: jdbc:postgresql://\u0026lt;HOST\u0026gt;:\u0026lt;PORT\u0026gt;/\u0026lt;DATABASE\u0026gt;?user=\u0026lt;USER\u0026gt;\u0026amp;password=\u0026lt;PASSWORD\u0026gt; userName: \u0026lt;USER\u0026gt; password: \u0026lt;PASSWORD\u0026gt; 4. Run the rosetta extract command to generate the DBML models from the PostgreSQL database tables.\nrosetta extract -s pg This command analyzes your database schema and creates a workspace with the corresponding DBML model, which you can review and modify if needed.\n5. Run the rosetta query command to write queries in natural language and explore the data; the output is written to a CSV file.\nExamples:\nrosetta query -s pg -q \u0026#34;Find the most borrowed book title.\u0026#34; rosetta query -s pg -q \u0026#34;Retrieve the names of students who have borrowed more than five books\u0026#34; --limit 10 rosetta query -s pg -q \u0026#34;Find the total number of books borrowed by each student\u0026#34; --output test.csv As the examples show, there are a few arguments you can add for specific scenarios:\nAdditional Arguments:\n-l --limit : Limit the number of rows in the response (Optional). The default value is 200. --no-limit : No limit on the number of rows in the response (Optional). 
--output : Specify the output directory or file (Optional). By default, if you do not use the --output argument, the CSV files will be saved in the pg/data/ directory with a filename based on the query and a timestamp.\nRosettaDB represents a significant leap forward in data management and exploration. By integrating OpenAI’s powerful LLM AI, RosettaDB makes it easier and more accessible for everyone to interact with data. Whether you’re a developer, a data scientist, or just someone curious about the insights hidden in your data, RosettaDB offers a versatile and user-friendly platform to explore and manipulate data efficiently. As data continues to drive decisions more than ever, tools like RosettaDB will become crucial in harnessing the power of information in the digital age.\nDive into RosettaDB and start transforming your data interaction experience today!\n","date":"2024-07-16T00:00:00Z","image":"https://rosettadb.io/blogs/explore-your-data-using-ai-/rosettadb_hu_8229888061988c7a.png","permalink":"https://rosettadb.io/blogs/explore-your-data-using-ai-/","title":"Explore your DATA using AI"},{"content":" # Automating Schema \u0026amp; Data Migration with RosettaDB and Kinetica Learn how you can leverage a modern DI stack for building \u0026ldquo;Data Products\u0026rdquo; with RosettaDB, Git and GitHub Actions. We partnered with Kinetica for this event to demonstrate how to build automated data pipelines for delivering data along with the requisite schema management.\nThis webinar was inspired by the blog: Efficient Change Schema Capture (CSC) and Schema Translations with RosettaDB\n","date":"2024-07-10T00:00:00Z","permalink":"https://rosettadb.io/videos/automating-schema--data-migration-with-rosettadb-and-kinetica/","title":"Automating Schema \u0026 Data Migration with RosettaDB and Kinetica"},{"content":" RosettaDB is an open-source toolkit designed for efficient migration of database schemas. 
It automates the conversion of schemas and data types, ensuring smooth transitions while maintaining data integrity. The tool supports complex migrations by handling differences in data types, thereby reducing costs and technical complexities. It also provides you with the capabilities of Change Schema Capture (CSC) from one database to another. The process includes initial configuration, schema extraction, DDL generation, and schema application, streamlining the transition from one database to another.\nRosettaDB is an open source toolkit that enables seamless Information Lifecycle Management (ILM), including migration of databases from one database to another. It also provides a DDL (Data Definition Language) transpiler, enabling the translation of database schemas from one database system to another with no manual intervention. This capability is crucial for organizations looking to transition their data storage solutions to more sophisticated or specialized database systems.\nRosettaDB helps automate the conversion of schemas, data types, and database structures, ensuring that the transition is not only smooth but also retains the integrity and functionality of the original database system. This reduces the complexity and risk associated with database migrations, and significantly cuts down the time and resources needed for such projects.\nKineticaDB is a next-generation database that harnesses the power of GPU acceleration to handle complex analytical computations with extraordinary speed and efficiency. In the world of big data, where rapid processing and analysis of large volumes of data are crucial, Kinetica offers significant advantages. 
Its architecture is designed to manage massive datasets and high velocity real-time data feeds, making it an exceptional choice for applications in finance, retail, healthcare, and energy sectors that require real-time analytics and decision-making.\nThe use of GPUs allows Kinetica to perform parallel data processing, dramatically speeding up the analysis times compared to traditional CPU-bound databases. This capability is particularly beneficial for machine learning and AI-driven applications, where faster data processing translates directly into quicker insights and more responsive decision-making systems. Moreover, Kinetica supports geospatial data types and functions, which are essential for location-based analytics and operational intelligence.\nA major challenge companies face is the migration of databases, schemas, and tables from one database vendor to another. This task is particularly complex when moving from traditional relational databases (RDBMS) like PostgreSQL to cloud-based databases designed for analytics, such as KineticaDB.\nThese challenges include:\nData Integrity and Consistency: Ensuring that data remains accurate and consistent post-migration. Schema and Data Type Compatibility: Different databases support different data types and structures, which can lead to significant hurdles during migration. Performance Considerations: Migrations can affect the performance of applications, especially when moving from systems optimized for transactional processing to those optimized for analytical processing. Cost and Complexity: Migrations can be costly and require significant technical expertise, often necessitating the use of intermediary tools like RosettaDB. Migrating from a relational database to a cloud database optimized for analytics involves not just moving data but transforming the way data is structured and accessed. 
This requires thoughtful planning and execution to leverage the new platform’s strengths without losing the value of the legacy data. In order to achieve the above requirements use RosettaDB and the following steps:\n1. Download and configure RosettaDB on your machine. Download all the required JDBC drivers. For more details on this step please refer to the Getting Started section of RosettaDB docs https://github.com/rosettadb/rosetta#getting-started\n2. Create a new rosetta project using the init command\nrosetta init [PROJECT_NAME] 3. Edit the main.conf to configure the connection for the PostgreSQL and KineticaDB\nExample:\nconnections: - name: pg databaseName: postgres schemaName: rosseta_testing dbType: postgres url: jdbc:postgresql://\u0026lt;HOST\u0026gt;:\u0026lt;PORT\u0026gt;/\u0026lt;DATABASE\u0026gt;?user=\u0026lt;USER\u0026gt;\u0026amp;password=\u0026lt;PASSWORD\u0026gt; userName: \u0026lt;USER\u0026gt; password: \u0026lt;PASSWORD\u0026gt; tables: - \u0026lt;TABLE_1\u0026gt; - \u0026lt;TABLE_2\u0026gt; - name: kinetica databaseName: kinetica schemaName: ki_home dbType: kinetica url: jdbc:kinetica:URL=http://\u0026lt;HOST\u0026gt;:\u0026lt;PORT\u0026gt;;CombinePrepareAndExecute=1;Schema=ki_home; userName: \u0026lt;USER\u0026gt; password: \u0026lt;PASSWORD\u0026gt; 4. Run the rosetta extract command to generate the DBML models from PostgreSQL tables\nrosetta extract -s pg Since now we have the DBML models, we can review it, and use it for the next steps. The generated DBML models are ready to be converted to the target DDL and executed to the target DB.\n5. Run rosetta compile to generate the DDLs for KineticaDB\nrosetta compile -s pg -t kinetica 6. Review the generated files, if everything is as expected you can apply these changes to the target DB (KineticaDB)\n7. Run rosetta apply to generate the tables in KineticaDB\nrosetta apply -s kinetica This step will create the schema and all the tables in KineticaDB.\n8. 
If you continue to introduce changes in your PostgresDB and want to apply the same changes in your KineticaDB, then you simply run the steps:\nrosetta extract -s pg rosetta compile -s pg -t kinetica rosetta apply -s kinetica or (if you want to skip the review of the generated DDL):\nrosetta extract -s pg -t kinetica rosetta apply -s kinetica The above steps will only apply the changes based on the diff that the current version of DB has compared to the new model.yaml.\nTo verify the difference between two versions you can use the diff command:\nrosetta diff -s kinetica The migration to high-performance analytics databases like KineticaDB, facilitated by tools such as RosettaDB, represents a transformative step for businesses aiming to enhance their data analytics capabilities. While the journey involves challenges, the strategic use of technology can mitigate risks and maximize the effectiveness of data resources in the digital age. By understanding the tools and techniques available for these migrations, companies can better position themselves to take advantage of the opportunities presented by modern data analytics platforms.\n","date":"2024-05-08T00:00:00Z","image":"https://rosettadb.io/blogs/efficient-change-schema-capture-csc-and-schema-translations-with-rosettadb/rosettadb_hu_8229888061988c7a.png","permalink":"https://rosettadb.io/blogs/efficient-change-schema-capture-csc-and-schema-translations-with-rosettadb/","title":"Efficient Change Schema Capture (CSC) and Schema Translations with RosettaDB"},{"content":" # Migrating your database to Snowflake with Open Source. In this webinar we\u0026rsquo;ll discuss some challenges with database migration and explore how we can leverage RosettaDB, Jupyter Notebooks, and Spark to migrate both database objects and the associated data.\nWe’ll discuss some of the challenges one typically faces with database migrations. 
We’ll explore RosettaDB as an open source alternative to help solve some of these challenges, and we’ll walk through an end-to-end demo of migrating both Postgres and MySQL databases to Snowflake.\n","date":"2023-09-28T00:00:00Z","permalink":"https://rosettadb.io/videos/migrating-your-database-to-snowflake-with-open-source./","title":"Migrating your database to Snowflake with Open Source."},{"content":"To migrate your PostgreSQL database to MySQL using Rosetta, you can follow these simple steps:\nInstall the required JDBC drivers for both PostgreSQL and MySQL databases.\nDownload and install Rosetta on your system.\nConfigure Rosetta to connect to your PostgreSQL and MySQL databases in a YAML config file. Here’s an example of how you can set up connections in the YAML config file:\nconnections: - name: postgres_prod databaseName: mydatabase schemaName: public dbType: postgres url: jdbc:postgresql://localhost:5432/mydatabase userName: user password: pass - name: mysql_prod databaseName: mydatabase schemaName: myschema dbType: mysql url: jdbc:mysql://localhost:3306/mydatabase userName: user password: pass Use Rosetta to generate DDL from your PostgreSQL database and transpile it to MySQL by running the following command: rosetta generate --source=postgres_prod --target=mysql_prod --output-dir=./mysql_ddl This will generate the MySQL DDL files in the ./mysql_ddl directory.\nExecute the generated DDL files on your MySQL database to create the required tables, indexes, and other objects. 
","date":"2023-05-20T00:00:00Z","permalink":"https://rosettadb.io/use-cases/migrate-from-postgresql-to-mysql/","title":"Migrate from PostgreSQL to MySQL"},{"content":"To test the accuracy of your migrated data in RosettaDB, you can follow these general steps:\nDefine your expected results: Before performing any testing, define what the expected data results should be after migration.\nSelect a representative sample of data: Choose a representative sample of data from your source database that includes all types of data (e.g., text, numeric, date/time, etc.) and represents a typical use case.\nMigrate the sample data: Use RosettaDB to migrate the selected sample data from your source database to the target database.\nVerify the migrated data: Once the migration is complete, verify the migrated data in the target database against the expected results defined in step 1.\nPerform additional testing: If necessary, perform additional testing on other parts of your data or with different sets of data.\nDocument your findings: Keep track of any issues or discrepancies found during testing, and document how they were resolved.\nRepeat testing: After resolving any issues found during testing, repeat the testing process to ensure that the migration was successful and accurate.\n","date":"2023-05-19T00:00:00Z","permalink":"https://rosettadb.io/use-cases/test-data-accuracy/","title":"Test Data Accuracy"},{"content":"To generate Spark Python and Scala data transfer code to RosettaDB, you can follow these steps:\nInstall the required JDBC drivers for your source and target databases.\nDownload and install Rosetta on your system.\nConfigure Rosetta to connect to your source and target databases. You can do this by updating the YAML config file with the connection details for each database.\nUse Rosetta to generate DDL from your source database and transpile it to your desired target. 
You can do this by running the Rosetta CLI command with the appropriate arguments.\nOnce you have generated the DDL, you can use Spark Python or Scala to transfer the data between the source and target databases. You can do this by writing code that reads data from the source database using Spark SQL or DataFrame APIs and writes the data to the target database using JDBC.\n","date":"2023-05-18T00:00:00Z","permalink":"https://rosettadb.io/use-cases/generate-spark-python-and-scala-data-transfer-code/","title":"Generate Spark Python and Scala Data Transfer Code"},{"content":"To generate DDL in Rosetta, you can follow these steps:\nInstall the required JDBC drivers for your source and target databases.\nDownload and install Rosetta on your system.\nConfigure Rosetta to connect to your source and target databases. You can do this by updating the YAML config file with the connection details for each database.\nUse the rosetta generate command to generate DDL from your source database. The syntax of the command is as follows:\nrosetta generate --source=\u0026lt;source_db_type\u0026gt; --target=\u0026lt;target_db_type\u0026gt; Replace \u0026lt;source_db_type\u0026gt; with the type of your source database (e.g., mysql, postgres, oracle, etc.), and replace \u0026lt;target_db_type\u0026gt; with the type of your target database.\nYou can also specify the following optional parameters:\n\u0026ndash;output=\u0026lt;output_file\u0026gt;: Specify the name of the output file where the generated DDL will be written. If not specified, the DDL will be written to stdout.\n\u0026ndash;logging-level=\u0026lt;logging_level\u0026gt;: Set the logging level (debug, info, warn, or error). Default is info.\n\u0026ndash;tables=\u0026lt;table_list\u0026gt;: Specify a list of tables to generate DDL for. If not specified, DDL will be generated for all tables in the source database.\n\u0026ndash;schemas=\u0026lt;schema_list\u0026gt;: Specify a list of schemas to generate DDL for. 
If not specified, DDL will be generated for all schemas in the source database.\nHere’s an example command to generate DDL for a MySQL source database and a Postgres target database:\nrosetta generate --source=mysql --target=postgres \\ --output=ddl.sql \\ --tables=my_table_1,my_table_2 \\ --schemas=my_schema_1,my_schema_2 In this example, we’re generating DDL for two specific tables (my_table_1 and my_table_2) and two specific schemas (my_schema_1 and my_schema_2). The generated DDL will be written to a file called ddl.sql.\nOnce you have the generated DDL, you can execute it on your target database to create the necessary schema and tables. You can use any SQL client or tool to execute the DDL script. Note that Rosetta generates declarative DBML models that can be used for conversion to alternate database targets. However, the generated DDL may require further modifications and optimizations to suit your specific use case and database configurations.\n","date":"2023-05-15T00:00:00Z","permalink":"https://rosettadb.io/use-cases/generate-ddl-rosettadb/","title":"Generating DDL"},{"content":" # Free Open Source database migration Migrating from MS SQL Server to Snowflake? Save time and money by migrating your database with Open Source software. This tutorial, Part 1, covers how you can migrate your database objects from Microsoft SQL Server to Snowflake with RosettaDB.\n","date":"2023-04-13T00:00:00Z","permalink":"https://rosettadb.io/videos/free-open-source-database-migration/","title":"Free Open Source database migration"},{"content":"Liquibase allows you to specify the database change you want using SQL or several different database-agnostic formats, including XML, YAML, and JSON. 
Developers can abstract the database code to make it extremely easy to push out changes to different database types.\nRosettaDB is an open source declarative data modeler and transpiler (https://github.com/rosettadb/rosetta#overview) that converts database objects from one database to another. Define your database in DBML and RosettaDB generates the target DDL and executes it for you. RosettaDB is also used as a DBT model generator and database testing toolkit for your data.\nIn this blog post we show a step-by-step approach to declarative data modeling in Google Spanner, first with Liquibase and then with RosettaDB, and then compare the two tools. The process that we perform with both tools is as follows:\nCreate new project Configure Extract the current state of the Google Spanner database Introduce some changes Apply the changes Rollback To achieve this with Liquibase, these are the steps we have to follow:\n1. Download and configure Liquibase on your machine. Download all the required JDBC drivers. For more details on this step please refer to the Getting Started section of Liquibase docs https://docs.liquibase.com/start/home.html\n2. Create a new liquibase project using the init command\nliquibase init project 3. Edit the liquibase.properties to configure the connection for Google Cloud Spanner. Example:\nchangeLogFile=my_db_changelog.json liquibase.command.url=jdbc:cloudspanner:/projects/my_project/instances/my_instance/databases/SimpleDB liquibase.command.username: root liquibase.command.password: root 4. Capture the current state of your database by creating a deployable Liquibase changelog using the command\nliquibase generateChangeLog --changeLogFile=my_actual_state_db_changelog.json Now that we have the actual state of our database, we can make our first change by creating a changeset in our changelog file.\n5. 
We now add a new table to our database by creating a new changeset in our changelog file\n{ \u0026#34;databaseChangeLog\u0026#34;: [{ \u0026#34;changeSet\u0026#34;: { \u0026#34;id\u0026#34;: \u0026#34;123456789-1\u0026#34;, \u0026#34;author\u0026#34;: \u0026#34;root\u0026#34;, \u0026#34;changes\u0026#34;: [ { \u0026#34;createTable\u0026#34;: { \u0026#34;columns\u0026#34;: [ { \u0026#34;column\u0026#34;: { \u0026#34;constraints\u0026#34;: { \u0026#34;nullable\u0026#34;: false, \u0026#34;primaryKey\u0026#34;: true, \u0026#34;primaryKeyName\u0026#34;: \u0026#34;PRIMARY_KEY\u0026#34; }, \u0026#34;name\u0026#34;: \u0026#34;LogId\u0026#34;, \u0026#34;type\u0026#34;: \u0026#34;INT64\u0026#34; } }, { \u0026#34;column\u0026#34;: { \u0026#34;name\u0026#34;: \u0026#34;Description\u0026#34;, \u0026#34;type\u0026#34;: \u0026#34;STRING(MAX)\u0026#34; } } ], \u0026#34;tableName\u0026#34;: \u0026#34;Logs\u0026#34; } } ] } } ]} 6. After we add our new changes to our changelog, we update the actual state of our database by using the command\nliquibase update This command deploys the changes to our database, and we are ready for the next set of commands.\n7. Now we can automatically roll back the last change to our database by running the Liquibase rollback command like this\nliquibase rollbackCount 1 This command reverts the last changeset applied to our database, which rolls the database back to its previous state.\nWith the above steps we demonstrated how to use Liquibase to extract the current state of the database, change the database, apply the change with the update command, and roll back to the previous state.\nNow we will show how the same process can be performed using RosettaDB: https://github.com/rosettadb/rosetta\nTo achieve this with RosettaDB, these are the steps we have to follow:\n1. Download and configure RosettaDB on your machine. Download all the required JDBC drivers. 
For more details on this step please refer to the Getting Started section of RosettaDB docs https://github.com/rosettadb/rosetta#getting-started\n2. Create a new rosetta project using the init command\nrosetta init cloudspanner_project 3. Edit the main.conf to configure the connection for Google Cloud Spanner. Example:\nconnections: - name: cloudspanner_conn databaseName: SimpleDB schemaName: dbType: spanner url: jdbc:cloudspanner:/projects/my_project/instances/my_instance/databases/SimpleDB userName: password: 4. Run the rosetta extract command to generate the DBML models from Google Cloud Spanner tables\nrosetta extract -s cloudspanner_conn Now that we have the DBML models, we can review them and use them in the next steps. We are ready to add new changes to our database.\n5. We now add a new table to our database by adding it to the tables property of our DBML model\n--- safeMode: false tables: - name: \u0026#34;Logs\u0026#34; type: \u0026#34;TABLE\u0026#34; schema: \u0026#34;SimplePOS\u0026#34; columns: - name: \u0026#34;LogId\u0026#34; typeName: \u0026#34;INT64\u0026#34; ordinalPosition: 0 primaryKeySequenceId: 1 columnDisplaySize: 0 scale: 0 precision: 5 autoincrement: false nullable: false primaryKey: true - name: \u0026#34;Description\u0026#34; typeName: \u0026#34;STRING(MAX)\u0026#34; ordinalPosition: 0 primaryKeySequenceId: 0 columnDisplaySize: 0 scale: 0 precision: 45 autoincrement: false nullable: false primaryKey: false databaseProductName: \u0026#34;SimpleDB\u0026#34; databaseType: \u0026#34;spanner\u0026#34; 6. After adding the new table to the DBML model, we use the apply command to update our database\nrosetta apply -s cloudspanner_conn This command deploys the changes to our database, and we are ready for the next set of commands.\n7. 
Now we can automatically roll back our database by choosing a previous state of the database from the snapshots directory and running the RosettaDB apply command like this:\nrosetta apply -s cloudspanner_conn -m snapshots/model-20230215-121137.yaml With the above steps we demonstrated how we can use RosettaDB as a declarative data modeler and as a DDL transpiler so we can add/change/update our database in a few steps.\nLiquibase vs. RosettaDB\nLiquibase RosettaDB Getting started No account needed; you can download the latest release from the Liquibase website. No account needed; you can download the latest release from GitHub. Configuration Uses liquibase.properties to specify the connection string and changeLogFile. Uses main.conf to specify the connections. Extracted Schema It can be in various formats: XML, JSON, YAML or SQL. It supports only YAML. Changes You need to create a new changelog file and add a changeset to define your changes. You update the extracted model.yaml based on your changes. Apply Changes #Register the changelog liquibase registerChangeLog\n#Update the current state liquibase update #Apply the changes based on the current state of model.yaml rosetta apply -s \u0026lt;CONNECTION_NAME\u0026gt; Rollback Roll back to a specific changeset or version. liquibase rollbackCount 1\nliquibase rollback -tag=1.0.0 Roll back to any version. Before each apply, it generates a snapshot with the current state. 
rosetta apply -s cloudspanner_conn -m snapshots/model-20230215-121137.yaml DDL Transpiler NO YES DBT Modeler NO YES Database Testing NO YES Supports Views YES YES Supports Interleaved Tables NO YES ","date":"2023-02-28T00:00:00Z","image":"https://rosettadb.io/blogs/comparing-liquibase-with-rosettadb/rosettadb-liquibase_hu_e6261af0c3dc625b.png","permalink":"https://rosettadb.io/blogs/comparing-liquibase-with-rosettadb/","title":"Comparing Liquibase with RosettaDB"},{"content":"Engineers today use BigQuery to build data warehouses because it is optimised for analytical queries and performs well on huge amounts of data. Assume a company has all its transactional data in a Postgres database and wants to build a data warehouse in BigQuery in a few steps.\nWe will show how this can be done using RosettaDB: https://github.com/rosettadb/rosetta\nRosettaDB is a declarative data modeler and transpiler that converts database objects from one database to another. Define your database in DBML and rosetta generates the target DDL for you.\nWith the help of RosettaDB, we will:\nInitialise a new project Configure a connection to the source DB, which in this case is PostgreSQL Configure a connection to the target data source, which is BigQuery Extract the current schema for the targeted tables from PostgreSQL and generate the declarative DBML models Convert the generated DBML models from the previous step, generate the DDLs for the target DB and apply the changes To achieve this with RosettaDB, these are the steps we have to follow:\n1. Download and configure RosettaDB on your machine. Download all the required JDBC drivers. For more details on this step please refer to the Getting Started section of RosettaDB docs https://github.com/rosettadb/rosetta#getting-started\n2. Create a new rosetta project using the init command\nrosetta init [PROJECT_NAME] 3. 
Edit the main.conf to configure the connections for PostgreSQL and BigQuery\nExample:\nconnections: - name: pg databaseName: postgres schemaName: rosseta_testing dbType: postgres url: jdbc:postgresql://\u0026lt;HOST\u0026gt;:\u0026lt;PORT\u0026gt;/\u0026lt;DATABASE\u0026gt;?user=\u0026lt;USER\u0026gt;\u0026amp;password=\u0026lt;PASSWORD\u0026gt; userName: \u0026lt;USER\u0026gt; password: \u0026lt;PASSWORD\u0026gt; tables: - \u0026lt;TABLE_1\u0026gt; - \u0026lt;TABLE_2\u0026gt; - name: bq databaseName: bigquery-public-data schemaName: austin_311 dbType: bigquery url: jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;ProjectId=\u0026lt;PROJECT_ID\u0026gt;;AdditionalProjects=bigquery-public-data;OAuthType=0;OAuthServiceAcctEmail=\u0026lt;EMAIL\u0026gt;;OAuthPvtKeyPath=\u0026lt;SERVICE_ACCOUNT_KEY_PATH\u0026gt; userName: password: 4. Run the rosetta extract command to generate the DBML models from PostgreSQL tables\nrosetta extract -s pg Now that we have the DBML models, we can review them and use them in the next steps. The generated DBML models are ready to be converted to the target DDL and executed against the target DB.\n5. Run rosetta compile to generate the DDLs for BigQuery\nrosetta compile -s pg -t bq 6. Review the generated files. If everything is as expected, you can apply these changes to the target DB (BigQuery)\n7. 
Run rosetta apply to generate the tables in BigQuery\nrosetta apply -s bq With the above steps we demonstrated the process of how you can use RosettaDB as a declarative data modeler and as DDL transpiler so you can build your data warehouses in just a few steps.\n","date":"2023-02-21T00:00:00Z","image":"https://rosettadb.io/blogs/declarative-data-modeling-for-your-data-warehouses-in-bigquery-using-open-source-rosettadb/rosettadb_hu_8229888061988c7a.png","permalink":"https://rosettadb.io/blogs/declarative-data-modeling-for-your-data-warehouses-in-bigquery-using-open-source-rosettadb/","title":"Declarative Data Modeling for your Data Warehouses in BigQuery using Open Source RosettaDB"},{"content":" # Database Migration with RosettaDB Learn how to migrate your database to the cloud with ease!\nMigrating databases to the cloud can be a daunting task. With RosettaDB you can migrate all your tables in one go! In this webinar we explore how you can migrate your on-prem database to the cloud using a simple no-code approach without having to fuss with SQL.\nIn this video we show how you can do the following:\nExtract a MySQL database model (Sakila) Deploy the database model (Sakila) to BigQuery Generate DBT models for ELT Perform declarative modeling changes to the schema Load data Test the database with a set of tests Helpful Links:\nGitHub Link - https://github.com/rosettadb/rosetta\nDownload - https://github.com/rosettadb/rosetta\nProject Sponsor - Rosetta Labs\n","date":"2022-09-28T00:00:00Z","permalink":"https://rosettadb.io/videos/database-migration-with-rosettadb/","title":"Database Migration with RosettaDB"},{"content":" # ProEDMS v1.0 Demo Rosetta Labs is a data cataloging and data discovery system that enables data governance for DataOps via tagging and labels, schema evolution, data extraction, column masking, data profiling, and more.\n","date":"2022-07-13T00:00:00Z","permalink":"https://rosettadb.io/videos/proedms-v1.0-demo/","title":"ProEDMS v1.0 
Demo"},{"content":" # Data Transfer ","date":"2022-07-12T00:00:00Z","permalink":"https://rosettadb.io/videos/schema-evolution/","title":"Data Transfer"},{"content":" # Schema Evolution ","date":"2022-07-12T00:00:00Z","permalink":"https://rosettadb.io/videos/data-transfer/","title":"Schema evolution"},{"content":"","date":"2019-03-05T00:00:00Z","permalink":"https://rosettadb.io/whitepapers/example/","title":"whitepaper 1"}]