Tableau Server delivers its capabilities through a set of processes that govern extract refreshes, database connections, workbooks, data sources, and more. Because so many processes are involved, it pays to tune Tableau Server for better performance. This post looks at six practices that help you optimize Tableau Server's performance.
1. Use Published Data Sources (PDS)
Instead of connecting to databases every time a workbook is created, use published data sources to supply information to multiple workbooks. Published data sources provide the following benefits:
- As a single source of data for multiple workbooks, a PDS saves a lot of space on Tableau Server. With workbook-embedded data sources, each workbook stores its own copy, so the space consumed grows with the volume of data in every workbook.
- Processing load on the server is reduced when multiple workbooks share the same data source.
- Formulas and calculations are created once in Tableau Desktop and saved in the published data source, so they need not be re-created every time a new workbook is developed.
- Refresh scheduling and orchestration become simpler: one refresh of the PDS is reflected in all workbooks connected to it, so there is no need to set up a separate, time-consuming refresh for each workbook.
- New workbooks can connect to the data source easily, even without a connection to the underlying databases.
It is also essential to understand that the published data source itself cannot be modified from a workbook that connects to it. Editing its formulas or adding calculations to it means changing the data source and publishing it again.
2. Use Row Level Security (RLS)
Developers sometimes create multiple versions of the same workbook to cater to different user roles (e.g., the Manager for the Eastern Region should see only data for that region). They do this by creating a filtered version of the data set for each role. However, this consumes a lot of space on Tableau Server and hampers its performance. In addition, every future change to the dashboard must be applied to all versions, which costs time and effort.
With Row Level Security (RLS) applied to a workbook, different users open the same dashboard, and each sees only the data permitted for their role.
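The idea behind RLS can be illustrated outside Tableau with a minimal Python sketch. It assumes a hypothetical entitlement table that maps each user name to the regions that user may see. In Tableau the comparable step is usually a user filter or a calculated field that checks the signed-in user (for example via the USERNAME() function) against such a mapping. All names and figures below are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SalesRow:
    region: str
    amount: float


# Hypothetical entitlement table: the regions each user is allowed to see.
ENTITLEMENTS = {
    "manager_east": {"East"},
    "manager_west": {"West"},
    "vp_sales": {"East", "West"},
}


def visible_rows(username, rows):
    """Return only the rows whose region the given user is entitled to see."""
    allowed = ENTITLEMENTS.get(username, set())  # unknown users see nothing
    return [row for row in rows if row.region in allowed]


# One shared data set serves every role; only the filter result differs.
rows = [SalesRow("East", 120.0), SalesRow("West", 80.0), SalesRow("East", 45.5)]
east_view = visible_rows("manager_east", rows)
full_view = visible_rows("vp_sales", rows)
```

One data set and one dashboard are maintained, and the per-user filter replaces the per-role copies described above.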
3. Schedule Data Refreshes during Non-Business Hours
Extract refreshes scheduled during office hours may take longer, because many other processes are likely to be using the same databases and tables at that time.
It is therefore a best practice to schedule data refreshes during non-business hours, when less contention for the databases generally means faster refreshes.
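As a rough illustration, a proposed refresh time can be checked against the working day before it is put on a schedule. The sketch below is plain Python, not a Tableau API. The 08:00 to 18:00 window and the candidate times are assumptions to adjust for your organization.

```python
from datetime import time

# Assumed business hours; change these to match the actual working day.
BUSINESS_START = time(8, 0)
BUSINESS_END = time(18, 0)


def is_off_hours(start):
    """Return True if a refresh starting at `start` falls outside business hours."""
    return not (BUSINESS_START <= start < BUSINESS_END)


# Hypothetical candidate start times for an extract refresh.
candidates = [time(2, 0), time(9, 30), time(13, 0), time(22, 15)]
off_hours_slots = [slot for slot in candidates if is_off_hours(slot)]
```

A real schedule would also need to account for time zones and for how long each refresh runs, which this sketch ignores.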
4. Prioritize Scheduled Refreshes
A refresh schedule may contain many data refreshes pointing to different workbooks. Some involve a huge amount of data and take a long time to refresh their extracts, while others involve less data and finish more quickly.
By prioritizing data refreshes, we can set the sequence so that refreshes for workbooks with less data are triggered first, followed by those involving large amounts of data. The reports with less data then become available sooner.
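One way to derive such a sequence is to rank the extracts by estimated size and spread them over the priority range. In Tableau Server a task priority is a number from 1 to 100, and lower numbers run first. The sketch below is plain Python with hypothetical extract names and row counts. It only computes the numbers and does not apply them to a server.

```python
def assign_priorities(estimated_rows, lowest=1, highest=100):
    """Map each extract to a priority number; smaller extracts get smaller numbers."""
    ordered = sorted(estimated_rows, key=estimated_rows.get)
    if len(ordered) == 1:
        return {ordered[0]: lowest}
    # Spread the extracts evenly between the lowest and highest priority values.
    step = (highest - lowest) / (len(ordered) - 1)
    return {name: round(lowest + i * step) for i, name in enumerate(ordered)}


# Hypothetical extracts with estimated row counts.
extracts = {
    "daily_sales": 50_000,
    "customer_master": 2_000_000,
    "web_clickstream": 150_000_000,
    "hr_headcount": 3_000,
}
priorities = assign_priorities(extracts)
run_order = sorted(priorities, key=priorities.get)
```

Row count is only a proxy for refresh duration. If historical run times are available, ranking by those would likely give a better ordering.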
5. Favor Incremental over Full Refresh whenever possible
An incremental refresh appends new records to the records already in the extract, whereas a full refresh deletes the existing records and reloads everything, old records along with new ones.
Incremental refresh is advisable because it takes much less time than a full refresh, unless there is a mandatory business requirement to reload all the data.
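The difference between the two refresh types can be sketched in a few lines of Python. The example assumes a hypothetical `order_id` column that only ever increases, which serves as the marker for "new" rows. It is a simplified model of the behavior, not Tableau's implementation.

```python
def full_refresh(source):
    """Discard the old extract and reload every row from the source."""
    return list(source)


def incremental_refresh(extract, source, key="order_id"):
    """Append only the source rows whose key exceeds the extract's highest key."""
    high_water_mark = max((row[key] for row in extract), default=None)
    new_rows = [
        row for row in source
        if high_water_mark is None or row[key] > high_water_mark
    ]
    return extract + new_rows, len(new_rows)


# Hypothetical data: the source has gained two rows since the last refresh.
source = [{"order_id": i, "amount": 10 * i} for i in range(1, 6)]
extract = source[:3]

refreshed, rows_loaded = incremental_refresh(extract, source)
reloaded = full_refresh(source)
```

Both approaches end with the same five rows here, but the incremental path loads two rows while the full path loads all five. The sketch also shows the limitation: a change to an existing row (same `order_id`, new `amount`) would not be picked up incrementally, which is one reason a periodic full refresh may still be required.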
6. Retain Data in Cache Memory
While installing Tableau Server, the user is prompted on the Data Connections tab to choose how the cache is handled.
a) Refresh Less Often: Data from the source is cached, and every subsequent time the report is accessed, the cached data is displayed. This reduces the load on Tableau Server, since no query is sent to the database each time the report is accessed, and thereby improves performance. This option is best suited to data that changes infrequently. The latest data is reflected in the report only when the report is manually refreshed or when Tableau Server is restarted.
b) Balanced: The user specifies how long data is cached. Data is not held in the cache beyond the specified time.
c) Refresh More Often: With a live connection to the data source, the data is refreshed in the background every time the report is accessed, before the report is displayed. With an extract connection, data is fetched from the latest refreshed version of the extract. Although this shows the latest data, it puts additional load on Tableau Server, which must fetch data from the data source every time the report is viewed. Choose this option when new records are added to the data source at very short intervals.
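The trade-off between the three options can be made concrete with a small simulation. The sketch below is a simplified Python model, not Tableau's caching code. `QueryCache`, the view name, and the access times are all hypothetical. A maximum age of `None` stands in for Refresh Less Often, a positive number of seconds for Balanced, and `0` for Refresh More Often.

```python
class QueryCache:
    """Toy cache whose maximum entry age models the three caching options."""

    def __init__(self, max_age_seconds):
        # None  -> keep entries until cleared   ("Refresh Less Often")
        # N > 0 -> keep entries for N seconds   ("Balanced")
        # 0     -> never reuse a cached entry   ("Refresh More Often")
        self.max_age_seconds = max_age_seconds
        self._entries = {}
        self.source_queries = 0

    def get(self, key, run_query, now):
        entry = self._entries.get(key)
        if entry is not None:
            value, stored_at = entry
            if self.max_age_seconds is None or now - stored_at < self.max_age_seconds:
                return value  # served from cache, no query sent to the source
        self.source_queries += 1
        value = run_query()
        self._entries[key] = (value, now)
        return value


def count_source_queries(max_age_seconds, access_times):
    """Count how many accesses reach the data source under a given policy."""
    cache = QueryCache(max_age_seconds)
    for t in access_times:
        cache.get("sales_dashboard", lambda: "query result", now=t)
    return cache.source_queries


# The same report is opened five times (times in seconds).
access_times = [0, 60, 120, 600, 660]
less_often = count_source_queries(None, access_times)
balanced = count_source_queries(300, access_times)
more_often = count_source_queries(0, access_times)
```

For the same five accesses, the number of queries reaching the source rises from the first policy to the last, which is the extra load described under option (c), paid in exchange for fresher data.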
These are some of the options to optimize your Tableau Server for better performance.
* * *
The post 6 Best Practices for Efficient Tableau Server Performance appeared first on Visual BI Solutions.