Insights Into Fast Big Data in Semiconductors – with Bill Pierson

6 June 2019 | 10 minutes

A conversation with Bill Pierson, VP of Semiconductors and Manufacturing at KX.

Bill Pierson is VP of Semiconductors and Manufacturing at KX, leading the growth of this vertical. He has extensive experience in the semiconductor industry, including time at Samsung, ASML and KLA. Bill specializes in applications, analytics, and control. He lives in Austin, Texas, and when not at work can be found on the rock climbing cliffs or at his son’s soccer matches.

The semiconductor manufacturing industry seems to be embracing fast data very enthusiastically. Why is that?

I know every generation seems to think the world is becoming faster, but it is certainly true of the current world of semiconductor manufacturing. And it’s particularly true of its data side, where fast big data has become critical in monitoring, controlling and improving the manufacturing processes.

Ironically, what amplifies the need for speed in semiconductor manufacturing is the fact that it is a relatively slow process to make a chip – taking up to 100 days to just produce a wafer. This does not include the rest of the supply chain for testing, packaging and end-use applications. Therefore, a problem identified on day one can save a lot of time, money and waste further down the line. To make that more real, if you take as your starting point that improving output by just 0.3% equates to one day’s production per year, and noting that daily manufacturing costs in a semiconductor fabrication plant, also called a fab, can run to millions of dollars, it is clear that big savings can be made from such improvements – if they can be identified.
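The arithmetic behind that rule of thumb can be checked with a short sketch. One day out of 365 is roughly 0.27% of a year, which rounds to the 0.3% quoted above. The daily cost figure below is a hypothetical placeholder standing in for "millions of dollars", not a number from the interview.

```python
# Rough check of the "0.3% is about one day of production" rule of thumb.
DAYS_PER_YEAR = 365

# Hypothetical daily fab running cost, used only for illustration.
ASSUMED_DAILY_COST_USD = 2_000_000


def pct_for_one_day(days_per_year: int = DAYS_PER_YEAR) -> float:
    """Percentage output gain that equals exactly one day of production."""
    return 100.0 / days_per_year


def days_gained(improvement_pct: float, days_per_year: int = DAYS_PER_YEAR) -> float:
    """Extra production days per year from a given percentage output gain."""
    return days_per_year * improvement_pct / 100.0


def annual_value(improvement_pct: float, daily_cost_usd: float) -> float:
    """Value of the gained days, priced at the assumed daily cost."""
    return days_gained(improvement_pct) * daily_cost_usd


if __name__ == "__main__":
    print(f"One day of output is {pct_for_one_day():.2f}% of a year")
    print(f"A 0.3% gain is worth about {days_gained(0.3):.2f} days per year")
    print(f"At the assumed daily cost: ${annual_value(0.3, ASSUMED_DAILY_COST_USD):,.0f}")
```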

What do you mean by “if they can be identified”?

It’s a classic big data problem. Machines, via sensors, are producing huge amounts of data on their status and their operational health, but the challenge is capturing it all, interpreting what parts of the data should be used, and combining them with other relevant fab data. The difficulty for the semiconductor world, and indeed the manufacturing industry in general, is that their incumbent technologies cannot cope with the massive volume and velocity of data involved. They were simply not built for these demands and, as machines increase both the number of sensors and their output frequency, those data challenges are increasing too.

But the good news, and the reason the semiconductor world is embracing fast big data, as you mentioned, is that they are now well-positioned to address it. There are two reasons why.

One reason is that the semiconductor industry is strong. Revenues are approaching $500B/yr, and that trend will prevail as semiconductors continue to permeate almost every fabric (literally) of our lives, from the clothes we wear to the cars we drive (or that drive us). In addition, the demand for integrated circuits is becoming almost insatiable – just look at the proliferation of data centers for cloud-based processing, the explosion of IoT sensors in the home and community, and the rollout of wireless infrastructure for 5G, as examples. So it’s an industry that has both the motivation and the capacity to invest in data collection, data analysis, and data management technology.

The other reason the semiconductor industry is embracing fast big data is that the required technology exists – kdb+ from KX. Kdb+ is a high-performance database commonly used with fast big data. For decades, financial institutions have used it to power sophisticated, high-frequency algorithmic trading and high-speed analytics. It is now being adopted in other industries, particularly for IoT sensor data, from utilities and Formula 1 racing to manufacturing and telco, all of which face similar fast big data challenges using their existing technologies.

Why is KX particularly suited to data challenges facing semiconductor manufacturers?

In the first instance, kdb+ is a time-series database – the world’s fastest in fact. That’s particularly important as manufacturing data is predominantly time-stamped. On the fab floor, its sources include process tools, metrology tools, and manufacturing execution systems; in the subfab, they include pumps, chillers, gas boxes, and facility operations; in other areas or buildings, they can include test, sort, and packaging. Consolidating this data in order to create a connected factory of interdependent processes and solutions is not unlike how Wall Street uses KX to amalgamate streaming prices and orders over multiple exchanges, and across different currencies, to create sophisticated algorithmic trading strategies.

The ability to process the massive volumes of data involved plays to KX’s strengths. Chamber tools have ever-increasing numbers of sensors (many hundreds) measuring things such as gas flow, pressure, temperature, and voltage, using in-situ metrology techniques such as optical endpoint detection and other sensors to control processes. New tools will include even more data outputs, especially with new technologies like EUV Lithography or Atomic Layer Etch and Deposition. The sensor data volumes dwarf what incumbent systems can manage, and the problem is exacerbated by the speed of the data, with sensor frequencies ranging from 100 Hz (100 times per second) to the kHz range for high-speed readings such as pump vibrations.
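To give a feel for the scale, the back-of-the-envelope sketch below multiplies sensor count by sampling frequency. The figure of 500 sensors per chamber is a hypothetical stand-in for "many hundreds"; the 100 Hz rate is the lower bound quoted above.

```python
# Back-of-the-envelope estimate of raw sensor readings from one chamber.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Hypothetical sensor count standing in for "many hundreds".
ASSUMED_SENSORS_PER_CHAMBER = 500


def readings_per_second(sensors: int, hz: int) -> int:
    """Readings produced each second by `sensors` sensors sampled at `hz` Hz."""
    return sensors * hz


def readings_per_day(sensors: int, hz: int) -> int:
    """Readings produced over a full day of continuous operation."""
    return readings_per_second(sensors, hz) * SECONDS_PER_DAY


if __name__ == "__main__":
    per_sec = readings_per_second(ASSUMED_SENSORS_PER_CHAMBER, 100)
    per_day = readings_per_day(ASSUMED_SENSORS_PER_CHAMBER, 100)
    print(f"{per_sec:,} readings per second, {per_day:,} readings per day")
```

Under these assumptions a single chamber at 100 Hz already produces billions of readings a day, before counting kHz-rate sensors or the rest of the fab.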

Being able to process such data in-memory is critical from a performance viewpoint, but the ability to combine it with historical data on disk is equally critical, from both a scalability and an analytics perspective, as the volumes involved will ultimately exceed memory capacity. In addition, by using KX’s own in-built powerful query and programming language, q, you can operate directly on the data rather than having to extract or move it between processes as you do in most other technologies. As a result, it enables the lightning-fast performance that is needed for real-time detection and predictive analytics.
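The idea of judging live readings against history can be illustrated with a minimal sketch. This is plain Python rather than kdb+ or q, and it does not represent how KX implements detection. It uses a simple single-sensor three-sigma rule as a stand-in technique, and the pressure values and function names are invented for the example.

```python
from statistics import mean, stdev


def baseline(history: list[float]) -> tuple[float, float]:
    """Mean and sample standard deviation of historical readings."""
    return mean(history), stdev(history)


def flag_deviations(live: list[float], history: list[float], k: float = 3.0):
    """Return (index, value) pairs for live readings more than k sigma from the historical mean."""
    mu, sigma = baseline(history)
    return [(i, x) for i, x in enumerate(live) if abs(x - mu) > k * sigma]


# Invented data: "historical" readings stand in for data on disk,
# "live" readings stand in for the in-memory stream.
historical_pressure = [10.0, 10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1]
live_pressure = [10.0, 10.1, 12.5, 9.9]

if __name__ == "__main__":
    print(flag_deviations(live_pressure, historical_pressure))
```

In a real fab the baseline would come from months of stored data and the comparison would typically be multivariate, but the shape of the computation is the same: history defines what is normal, and the stream is checked against it as it arrives.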

The storage and scalability capabilities of KX make it an ideal platform for machine learning programs too, especially since it provides interfaces to technologies like scikit-learn, Theano, and Keras, as well as enabling continued use of existing libraries written in Python using embedPy and PyQ.

How does KX provide value to the semiconductor manufacturing industry?

In short, we provide the ability to solve problems more efficiently using data.

Consider the point I already mentioned about improving production by 0.3%. Process tools are becoming extremely expensive, with price tags in the tens of millions of dollars, so real-time monitoring is critical for identifying failures in order to increase uptime and improve productivity. Possibly more important is the ability to perform predictive analytics that enables you to avoid those tool failures in the first place, rather than just being reactive in identifying when they happen. That requires capturing and processing gigabyte volumes of tool sensor data, along with the metrology and/or yield-related data, for identifying deviations and variations in process tools and their impact on wafer production and quality.

It’s important to note that this applies not only to manufacturing 300mm wafers in newer, more expensive fabs, but also to the production of 200mm wafers in smaller and older fabs, as the variety of semiconductors needed in everyday life increases and we see a resurgence of their utilization.

What use cases does KX provide right now?

I will give you two examples that tap into the unparalleled performance of kdb+.

In the first instance, we worked with application providers who had difficulty managing large amounts of data in real time and who used KX technology to power their internal applications. We were able to deliver speeds up to 100 times faster than alternatives, enabling large queries over many months of data and cutting standard queries that previously took minutes down to just a handful of seconds. These queries were used for monitoring, control, root cause analysis and predictive analytics, and combined streaming data in-memory with historical data on-disk to capture the smallest multivariate deviations from process and tool fingerprints.

The second case relates to performance as well. KX technology can provide superior performance on a much smaller footprint using commodity hardware. For example, we have demonstrated environmental cost savings of greater than 90% compared to a legacy system used by a fab, thereby enabling them to optimize their infrastructure for the workload by implementing the data processing, storage, and analytics in a small appliance footprint. We are currently in the process of implementing sensor data management solutions for both the older 200mm fabs and the largest and newest 300mm fabs in the world.

In summary, KX technology offers the fastest ingestion speeds, which allows hardware costs to be minimized, superior performance for queries and calculations, and robust high availability (HA) capability to support 24×7 manufacturing. Our performance metrics include ingestion of more than 100 million sensor readings per second, over 1000 concurrent queries processed per second, and failover in less than 10 seconds with a Hot-Warm HA environment.

What is next for KX in the semiconductor industry?

One big opportunity for the semiconductor industry is breaking down the data silos between and within various segments: design, front-end manufacturing, yield/test, and packaging. This is critical because the time to market for new chips and new technology is moving ever faster, requiring the cycles of learning to be more efficient. Tying the manufacturing data to yield/test is already a common use case in order to improve manufacturing processes. Today, leaders are working to better connect the data in each segment for their engineers to improve processes and perform root cause diagnosis throughout the supply chain. Additionally, there is a growing interest in having traceability of the packaged chip all the way back to the silicon ingot, which will require common interface systems and methodologies. KX is starting to participate in the working groups of SEMI, a global industry association, to support these initiatives, and is engaging with integrated device manufacturers (IDMs) to provide these solutions.

Within the front-end manufacturing ecosystem, we see great opportunities too. Leading vendors have invested in their own versions of equipment-centric process control, sometimes called advanced equipment process control (AEPC). Using KX, and extending beyond managing sensor data in isolation to include wider fab data, specifically metrology/inspection data and potentially yield and test data, enables them to develop better and faster advanced process control (APC) solutions. As an example, we have begun working with a partner on combining metrology data with sensor data from process tools, utilizing the newest e-beam inspection tools, where root cause analysis can be converted to APC or AEPC solutions. Processing the enormous volumes of sensor data and metrology/inspection data in this way provides the ability to detect defects, measure critical dimension (CD) uniformity, and analyze edge placement errors (EPEs) for many areas of the chip and wafer, and to connect them to process and tool responses to improve yield and lower costs. Engineers can use KX technology to efficiently manage this data, develop solutions, and act to solve problems.

Further, fast big data use cases will evolve as the semiconductor industry continues to grow and as technology becomes more complicated and expensive, requiring real-time solutions to minimize costs in this highly competitive landscape. The ability to solve and exploit them will depend on having the appropriate technologies to address the data challenges. I think KX is well-positioned to be an important tool in that mission.
