🎵 Audio ControlNet

Fine-Grained Text-to-Audio Generation with Conditions

Text-to-audio (T2A) GUI with conditional inputs for Audio ControlNet.


Control Inputs

  • Loudness: reference audio controlling energy / dynamics
  • Pitch: reference audio controlling pitch contour
  • Sound Events: symbolic event-level constraints in JSON format
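
To make the control inputs above more concrete, here is a minimal sketch of how two of them might be prepared. It is an illustration under stated assumptions, not the project's actual preprocessing. The helper names (`frame_rms_db`, `make_events_json`), the frame and hop sizes, and the JSON field names (`label`, `start`, `end`) are all hypothetical, since the event schema is not specified here. Pitch-contour extraction would need an F0 estimator and is not shown.

```python
import json
import math


def frame_rms_db(samples, frame_len=1024, hop=512, eps=1e-10):
    """Frame-wise RMS energy in dB, one simple way to describe loudness over time.

    Assumption: the real system may use a different loudness feature.
    """
    if not samples:
        return []
    curve = []
    last_start = max(len(samples) - frame_len, 0)
    for start in range(0, last_start + 1, hop):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        curve.append(20.0 * math.log10(rms + eps))
    return curve


def make_events_json(events):
    """Serialize (label, start_s, end_s) tuples into a hypothetical JSON schema."""
    for label, start, end in events:
        if not (0 <= start < end):
            raise ValueError(f"invalid time span for {label!r}: {start}..{end}")
    payload = [{"label": l, "start": s, "end": e} for l, s, e in events]
    return json.dumps(payload, indent=2)


# Synthetic reference: a 440 Hz tone, loud in the first half, quiet in the second.
SR = 16000
reference = [
    (0.5 if n < SR // 2 else 0.1) * math.sin(2 * math.pi * 440 * n / SR)
    for n in range(SR)
]
loudness_curve = frame_rms_db(reference)

# Hypothetical event constraints: a dog bark, then a door slam.
events_json = make_events_json([("dog_bark", 0.0, 1.5), ("door_slam", 2.0, 2.4)])
print(len(loudness_curve), "loudness frames")
print(events_json)
```

The loudness curve drops between the first and last frames, mirroring the change in amplitude of the reference tone. Validating the time spans before serializing keeps malformed event constraints out of the JSON input.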