“MySQL server has gone away” or “Lost connection to MySQL server during query”: these are some of the most common errors a developer or a DBA sees on screen whenever the data in a MySQL table grows beyond a certain limit.
This happens because MySQL cannot efficiently process very large amounts of data when a query is not written in an optimized form. Even an optimized query may require a great deal of I/O to scan the data when the volume to be processed is very large. In a typical startup, where optimization is rarely a priority, it becomes very difficult to query some of the larger MySQL tables such as “tb_transactions” or “tb_users”.
A constant stream of real-time data is pushed into these tables at high frequency. As a result, they grow very large, and MySQL starts to show signs of breaking down when large volumes of data are queried from them in real time.
In that scenario, different companies follow different strategies, depending on the bandwidth and expertise they have. A popular one is to move these large tables into Google BigQuery and run the analytical queries there.
But there is no free lunch: the convenience of processing and analyzing large MySQL data tables in BigQuery brings a cost with it. Google charges its users on the basis of the data processed per query (with a minimum of 200 MB per query). As a result, companies with huge data sets end up paying a large bill at the end of every month.
The irony is that this cost grows with every extra bit of data stored in BigQuery (with each passing day) and with every extra bit of data queried. These large sums of money can, however, be avoided with some clever query skills and a little bit of awareness.
Below are some of the best methods that can help a company or a developer reduce BigQuery costs to a great extent, irrespective of how much the data grows every day.
Google provides the TABLE_DATE_RANGE() function for exactly this cost-reduction purpose. Start storing the data in smaller, per-day tables: instead of keeping all transactional data in one huge “tb_transactions” table, create one “tb_transactions” table per day, e.g. “tb_transactions20170815” for the data of Aug 15, 2017. Storing data in smaller tables means each query processes only as much data as the end user actually needs.
We just need to pass the date range explicitly to TABLE_DATE_RANGE(), and the execution cost per query drops accordingly. For details about this function and its syntax, see the Google BigQuery reference.
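As a rough illustration, here is a minimal legacy SQL sketch of the pattern. The dataset name my_dataset and the columns user_id and amount are hypothetical; only the tb_transactionsYYYYMMDD naming follows the example above.

```sql
#legacySQL
-- Scans only the daily shards for Aug 1-15, 2017, instead of the
-- entire transaction history, so far less data is billed.
SELECT user_id, amount                  -- hypothetical columns
FROM TABLE_DATE_RANGE(
  [my_dataset.tb_transactions],         -- prefix shared by the daily tables
  TIMESTAMP('2017-08-01'),
  TIMESTAMP('2017-08-15')
)
```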
The second method is to select only the required columns from a table instead of selecting all columns with ‘*’. To see why, we first have to understand how Google charges: billing is based on the total data scanned (all rows of every selected column), not on the rows that are actually returned.
Each column we fetch is multiplied by its data type size and by the number of rows in the table, which gives the actual amount of data we are charged for. For example, say we have an “id” column with data type “int” (4 bytes) and the table has 1,000 rows.
Then even running ‘select id from table_name limit 1’ will process 4*1,000 bytes of data instead of the 4*1 bytes one might expect. This is because, for BigQuery, selecting a column means the entire data for that column in the table is processed and therefore charged.
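To make the difference concrete, here is a small sketch of the two styles as separate queries; the dataset name my_dataset is hypothetical, while the table and column follow the examples above.

```sql
#legacySQL
-- Wasteful: every column of every row is scanned, so the whole table is billed.
SELECT * FROM [my_dataset.tb_transactions20170815]
```

```sql
#legacySQL
-- Cheaper: only the "id" column is scanned and billed
-- (rows x column size), regardless of the LIMIT 1.
SELECT id FROM [my_dataset.tb_transactions20170815] LIMIT 1
```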
Using just these two methods can help a company bring down and optimize its BigQuery analytics costs to a great extent.
DataToBiz is a Data Science, AI, and BI Consulting Firm that helps Startups, SMBs and Enterprises achieve their future vision of sustainable growth.