Project Moonshot facilitates manual and automated red-teaming, incorporating automated attack modules based on research-backed techniques to test multiple LLM applications simultaneously.
Red-Teaming allows the adversarial prompting of LLMs to induce them to behave in a manner incongruent with their design.
As Red-Teaming conventionally relies on humans, it is hard to scale. Project Moonshot has developed some attack modules that enable automated prompt generation, which allows automated red teaming.