1. Objective
The main objective of this project was to create an RL-based semiconductor macro placement solution. Before turning it into a product, we wanted to conduct a proof of concept (PoC) to check the feasibility of the approach.
Within the semiconductor design flow, we focused on physical design, especially floorplanning.
We also conducted market research and initial product planning during this project. Because of security restrictions, I can't fully disclose all the information.
2. Data
We were given several kinds of semiconductor design data (.v, .lef, .def, .tcl) and initially spent some time studying the industry to understand them. We were given the following data:
- Verilog File: Verilog is a hardware description language (HDL) used to model and simulate digital systems. .v files contain the source code written in Verilog, which describes the behavior and structure of digital circuits.
- LEF File: LEF (Library Exchange Format) is a standard file format used to describe the physical and geometric properties of the components in an integrated circuit library. .lef files contain information such as cell footprints, layer geometries, pin positions, and routing constraints.
- DEF File: DEF (Design Exchange Format) is another standard file format used in the semiconductor industry to represent the physical layout of an IC design. .def files contain detailed information about the placement and routing of components, metal layers, vias, and other physical structures.
We extracted the information useful for RL from these files. The final output is a TCL file: Tcl is a scripting language commonly used in the semiconductor industry, particularly in electronic design automation (EDA) tools, and our output script records each macro's position and orientation. We used ICC2 to verify our results.
3. Reinforcement Learning
The main skills and tools I used in this project are: PyTorch, Dreamplace, CNNs, Verilog, Pointer Networks, and GNNs.
RL MDP (a minimal sketch of these components follows the list):
- State: the rendered image of the canvas showing the macro placements and each macro's pin placement; we therefore used a convolutional neural network (CNN) to encode it. We also included meta information about the specific macro being placed.
- Action: the action space is the product of the number of grid cells and the number of rotation types.
- Reward: the reward considers wirelength, congestion, and hierarchy area. From the input data we can extract the hierarchy of the macros, and it is better for macros in the same hierarchy to be placed close to one another.
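Below is a minimal PyTorch sketch of how these three components can fit together. The layer sizes, grid size, orientation count, meta-vector dimension, and reward weights are illustrative assumptions, not the actual model or weights used in the project.

```python
import torch
import torch.nn as nn

class MacroPlacementPolicy(nn.Module):
    """Encode the rendered canvas with a CNN, concatenate the meta vector
    of the macro being placed, and score every (grid cell, orientation) pair."""

    def __init__(self, grid_size=32, num_orients=8, meta_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(64 + meta_dim, grid_size * grid_size * num_orients)

    def forward(self, canvas, meta):
        feat = self.encoder(canvas)                        # (B, 64)
        return self.head(torch.cat([feat, meta], dim=-1))  # one logit per action

def decode_action(action_index, grid_size=32, num_orients=8):
    """Map a flat action index back to (row, col, orientation)."""
    cell, orient = divmod(action_index, num_orients)
    row, col = divmod(cell, grid_size)
    return row, col, orient

def reward(wirelength, congestion, hierarchy_area, weights=(1.0, 0.5, 0.5)):
    """Negative weighted sum of the domain costs; weights are illustrative."""
    w_wl, w_cong, w_hier = weights
    return -(w_wl * wirelength + w_cong * congestion + w_hier * hierarchy_area)
```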
4. Problems and Solutions
Some of the problems faced during the project and the solutions used to overcome them:
- Problem: Rewards are sparse because of the structure of the environment, where a reward is received only after more than 100 steps.
  Solution 1: Conduct research on exploration methodologies, then incentivize exploration by providing an intrinsic reward based on the distance between the current state and the past n states in embedding space (see the sketch at the end of this section).
  Solution 2: Ultimately, we completely resolved the non-convergence and oscillation issues while training a model that reduces the domain costs such as wirelength, congestion, and density.
- Problem: The policy converges early in training, resulting in insufficient exploration.
  Solution 1: Reduce the learning rate and increase the entropy coefficient to encourage more exploration.
  Solution 2: Address the vanishing-gradient problem in the CNN model by adding max-pooling layers.
- Problem: We were new to the semiconductor business, so the challenge was not only solving the technical problem but also finding a way into the market.
  Solution 1: Secured domain knowledge through collaboration with domestic companies and maintained business partnerships with them.
  Solution 2: Visited Silicon Valley to meet well-known EDA companies and discuss our work. We had the opportunity to present our work to them and obtain feedback to strengthen the product.
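The snippet below sketches the intrinsic-reward idea from the first problem above: keep the embeddings of the past n visited states and reward the agent in proportion to how far the current state's embedding is from them. The memory size, distance measure, and scale factor are illustrative assumptions, not the exact formulation we used.

```python
import torch

class EpisodicNoveltyBonus:
    """Keep the embeddings of the last n visited states and pay an intrinsic
    reward proportional to the current embedding's mean distance to them."""

    def __init__(self, n=100, scale=0.1):
        self.n = n
        self.scale = scale
        self.memory = []  # embeddings of recently visited states

    def bonus(self, embedding):
        embedding = embedding.detach()
        if not self.memory:
            self.memory.append(embedding)
            return 0.0
        past = torch.stack(self.memory)               # (k, d)
        dists = torch.norm(past - embedding, dim=-1)  # distance to each past state
        intrinsic = self.scale * dists.mean().item()
        self.memory.append(embedding)
        if len(self.memory) > self.n:
            self.memory.pop(0)                        # drop the oldest state
        return intrinsic

# total_reward = extrinsic_reward + novelty.bonus(state_embedding)
```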