The growth in data-driven decision making has accelerated the advance of analytics capabilities across the organization. The legal function is no exception, but few legal departments have the skills needed to put data analytics to use. What they need are data scientists within the legal function.
Data scientists possess the necessary skill sets to successfully implement and use advanced analytics. Hiring legal and compliance data scientists will enable legal to take advantage of analytics capabilities to manage risk, improve processes and inform optimal resource allocation.
Insights from the Gartner legal and compliance research team outline the role, reporting structure and utilization of data scientists in the legal and compliance function.
The role of the data scientist in legal
Data scientists are individuals with advanced degrees in quantitative disciplines such as mathematics, computer science or operations research. They can execute complex data discovery and exploration, perform data analysis to extract knowledge or insight from data (structured or unstructured), and build complex predictive and prescriptive models.
Simply put, data scientists are adept at transforming large amounts of structured and unstructured data into insights, predictive analytics and machine learning (ML) models for decision automation.
Data scientists differ from both legal experts and data engineers. Unlike legal experts, data scientists aren’t required to possess in-depth legal knowledge: the 2018 Gartner State of the Legal Function Survey showed that 14% of legal departments have hired a nonlawyer to serve as a dedicated legal data scientist or data analytics specialist.
Data engineers, by contrast, make the appropriate data accessible and available to data scientists.
Where to utilize data scientists in legal
Electronic discovery (e-discovery) is the most widely used workflow for legal data analytics, most often for culling, early case assessment and privilege review. Other legal workflows, including matter management and information governance, have also widely adopted analytics.
This suggests that legal and compliance functions most often deploy data scientists in these workflows.
Legal and compliance data scientists act as an important interface between the legal function and departments such as IT, facilitating the exchange of technological expertise and fostering analytics partnerships.
Some of the typical responsibilities of legal and compliance data scientists include:
- Structuring and analyzing legal and compliance data (such as documents and emails) to ensure high data quality and governance
- Using data to develop a deep understanding of various legal and compliance risks and build common analytic platforms for use throughout the legal department
- Leading analytical projects aimed at internal process improvement (e.g., to accurately price outside counsel work or to eliminate overlaps in processes)
- Assisting and training legal and compliance staff to embrace technology, process improvement and data analytics in their practice
- Visualizing information and developing legal and compliance reports and dashboards based on the results of data analysis
How well data scientists can execute these responsibilities depends on their experience. Junior data scientists excel at understanding data and building ML models but are less experienced in areas such as mentorship, collaboration and project prioritization. Senior data scientists, in contrast, step into a role primarily focused on management.
When to hire a data scientist in legal
Legal should hire data scientists only once it has a sufficient number of legal analytics use cases, a solid foundation of data and technology, and a culture that supports advanced analytics.
Hiring data scientists without the analytical infrastructure and use cases to make the most of their skills can increase personnel costs and leave that talent underused. Hiring too early leads to legal data scientists spending significant time on mundane tasks like data cleansing and static reporting.
Instead, they should be employed to perform more sophisticated tasks such as building models, applying natural language processing, automating time narratives, auto-classifying data and using ML techniques.
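To make "auto-classifying data" more concrete, here is a minimal sketch of the kind of model a legal data scientist might prototype: a small naive Bayes text classifier that labels documents as privileged or not. It is an illustration only, not a method described in the Gartner research. The class name, the labels and the training snippets are all hypothetical, and a real project would rely on far more data and an established ML library.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> number of documents
        self.vocabulary = set()

    def train(self, labeled_docs):
        """Accumulate word and document counts from (text, label) pairs."""
        for text, label in labeled_docs:
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocabulary.update(tokens)

    def predict(self, text):
        """Return the label with the highest log posterior score."""
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, float("-inf")
        for label in self.doc_counts:
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            label_total = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (label_total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical, hand-written training snippets (not real legal data).
TRAINING_DOCS = [
    ("confidential legal advice from counsel regarding the pending litigation", "privileged"),
    ("attorney client communication about settlement strategy", "privileged"),
    ("counsel recommends we do not disclose this legal opinion", "privileged"),
    ("lunch menu for the quarterly team meeting on friday", "not_privileged"),
    ("please find attached the updated sales forecast spreadsheet", "not_privileged"),
    ("reminder to submit your travel expense reports this week", "not_privileged"),
]

classifier = NaiveBayesClassifier()
classifier.train(TRAINING_DOCS)

print(classifier.predict("legal advice from our attorney on litigation strategy"))
print(classifier.predict("travel expense reports for the sales team"))
```

With these toy examples, the first query is labeled privileged and the second is not. The sketch also shows why the prerequisites above matter: a model like this is only as useful as the labeled, well-governed data behind it.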