High performance VLSI architecture for 2-D DWT using lifting scheme

A reduced-area, high-speed architecture for the lifting-based 2-D DWT is presented. Reducing the critical path delay to that of one multiplier normally requires at least four pipeline stages per lifting step. To reduce the number of pipeline stages, a small modification is adopted: the intermediate results are recombined and stored. This reduces the number of registers without extending the critical path. The architecture scans the image in parallel with two inputs and two outputs, which increases the speed, and the critical path is reduced to the delay of one multiplier using three pipeline stages. By replacing the multiplier with a shift-add method, the proposed design further reduces the critical path to the delay of one adder. For an N×N 2-D DWT, the proposed architecture requires three registers between the column and row filters.
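The shift-add idea can be illustrated with a small software model. The sketch below is not the proposed hardware. It only shows how a multiplication by a constant lifting coefficient can be replaced by shifted additions, and how that feeds a generic lifting step of the form target + coeff × (left + right). The choice of the 9/7 filter coefficients, the 8-bit fixed-point precision, the mirrored boundary handling, and all function names are assumptions made for illustration, since the abstract does not specify them.

```python
FRAC_BITS = 8  # assumed fixed-point precision, not taken from the paper


def to_fixed(coeff, frac_bits=FRAC_BITS):
    """Quantize a real coefficient to a signed fixed-point integer."""
    return round(coeff * (1 << frac_bits))


# First two lifting coefficients of the 9/7 filter (assumed filter choice).
ALPHA = to_fixed(-1.586134342)
BETA = to_fixed(-0.05298011854)


def shift_add_multiply(x, coeff_fixed, frac_bits=FRAC_BITS):
    """Multiply integer x by a fixed-point constant using only shifts and adds."""
    magnitude = abs(coeff_fixed)
    acc = 0
    bit = 0
    while magnitude:
        if magnitude & 1:
            acc += x << bit  # one addition per set bit of the coefficient
        magnitude >>= 1
        bit += 1
    if coeff_fixed < 0:
        acc = -acc
    return acc >> frac_bits  # discard the fractional bits (floor)


def lifting_step(target, left, right, coeff_fixed):
    """Generic lifting step: target + coeff * (left + right)."""
    return target + shift_add_multiply(left + right, coeff_fixed)


def forward_two_steps(samples):
    """Apply one predict and one update step to an even-length 1-D signal."""
    even, odd = samples[0::2], samples[1::2]
    n = len(odd)
    # Predict: odd samples are corrected from their even neighbours.
    # The missing right neighbour at the boundary is mirrored.
    high = [
        lifting_step(odd[i], even[i], even[min(i + 1, n - 1)], ALPHA)
        for i in range(n)
    ]
    # Update: even samples are corrected from the predicted values.
    # The missing left neighbour at the boundary is mirrored.
    low = [
        lifting_step(even[i], high[max(i - 1, 0)], high[i], BETA)
        for i in range(n)
    ]
    return low, high


if __name__ == "__main__":
    low, high = forward_two_steps([10, 12, 14, 16, 18, 20, 22, 24])
    print("low :", low)
    print("high:", high)
```

In this model each set bit of the quantized coefficient costs one addition. In hardware those additions could be spread over pipeline stages so that each stage holds a single adder, which is the intuition behind an adder-delay critical path. The full 9/7 transform has two further lifting steps and a scaling step that are omitted here.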