Phong Yang

Search · Enterprise

Enterprise Search & Document Processing

Backend & search infrastructure

Backend services, Solr-based search, and document pipelines for legal and tax research, with a focus on performance, indexing, and a path from on-prem toward the cloud.

Java · Spring · Python · REST · SOAP · Oracle · SQL Server · Solr · Elasticsearch · AWS EC2 · S3 · Jenkins · AngularJS

Overview

Developed backend services, search infrastructure, and document processing workflows for enterprise legal and tax research platforms serving large-scale content and retrieval needs.

Problem

Enterprise research products depend on accurate search, performant indexing, and reliable access to large volumes of structured and unstructured content. The challenge was to improve retrieval performance, integrate legacy systems, and support a gradual move toward more scalable infrastructure.

Architecture

The platform included Java and Python services, REST and legacy SOAP integrations, relational databases, indexing pipelines, and search infrastructure using Solr. My work also supported early cloud migration efforts and internal tools that improved operational workflows.

Content sources → Processing pipeline → Indexing → Search API → Research interface
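To make the flow from content sources through processing and indexing to a search API more concrete, here is a minimal Python sketch. It is illustrative only and not the platform's actual code: the `Document` fields, the `process` function, and the `ToyIndex` class are hypothetical names, and the in-memory inverted index is a stand-in for Solr so that the example runs with the standard library alone.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Document:
    # Hypothetical minimal schema; real legal and tax content carries far more metadata.
    doc_id: str
    title: str
    body: str


def process(raw: dict) -> Document:
    """Processing stage: normalize a raw record from a content source."""
    return Document(
        doc_id=str(raw["id"]),
        title=raw.get("title", "").strip(),
        body=" ".join(raw.get("body", "").split()),  # collapse runs of whitespace
    )


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class ToyIndex:
    """Indexing stage: an in-memory inverted index standing in for a search engine."""

    def __init__(self) -> None:
        self.postings = defaultdict(set)  # term -> set of doc_ids
        self.docs = {}                    # doc_id -> Document

    def add(self, doc: Document) -> None:
        self.docs[doc.doc_id] = doc
        for term in tokenize(f"{doc.title} {doc.body}"):
            self.postings[term].add(doc.doc_id)

    def search(self, query: str) -> list[Document]:
        """Search API stage: return documents that contain every query term."""
        terms = tokenize(query)
        if not terms:
            return []
        ids = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return [self.docs[i] for i in sorted(ids)]


# Usage: push two made-up records through the stages, then query the index.
raw_records = [
    {"id": 1, "title": "Capital gains", "body": "Tax  treatment of capital gains."},
    {"id": 2, "title": "Lease law", "body": "Commercial lease obligations."},
]
index = ToyIndex()
for raw in raw_records:
    index.add(process(raw))

hits = index.search("capital tax")
print([d.title for d in hits])
```

Keeping each stage as a separate function or class mirrors the stage boundaries in the diagram. In a setup like the one described, the indexing and search stages would be handled by a search engine such as Solr rather than an in-process structure, and relevance ranking would replace the simple all-terms match used here.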

My role

I developed backend services, worked on APIs and service integrations, improved database performance, contributed to search systems, and supported cloud migration and CI workflows. This role built the foundation for my later work in distributed systems and platform engineering.

Technologies

Systems mixed Spring-based services, Python utilities, enterprise RDBMS, Solr indexing, nascent Elasticsearch usage, Jenkins-driven delivery, and early AWS workloads alongside AngularJS frontends.

Impact

  • Improved large-scale document retrieval performance by ~30%
  • Enhanced search relevance and response times for enterprise research systems
  • Contributed to migration from on-prem infrastructure toward AWS-based environments
  • Built reliable processing and indexing workflows for legal and financial content