Attributes
Functions related to attribute generation.
def join_attributes_desc(ids: list[str]) -> dict[str, dict]
Gets the attributes and description for given product IDs.
Args:
    ids: The product IDs to get the attributes for.
Returns:
    Dict mapping product IDs to attributes and descriptions. Each ID maps to a dict with the following keys:
        attributes: e.g. {'color': 'green', 'pattern': 'striped'}
        description: e.g. 'This is a description'
def retrieve(desc: str,
             category: Optional[str] = None,
             image: Optional[str] = None,
             base64: bool = False,
             filters: list[str] = []) -> list[dict]
Returns list of attributes based on nearest neighbors.
Embeds the provided desc and (optionally) image and returns the attributes corresponding to the closest products in embedding space.
Args:
    desc: user-provided description of the product
    category: category of the product
    image: can be a local file path, GCS URI or base64-encoded image
    base64: True indicates image is base64. False (default) will be interpreted as an image path (either local or GCS)
    filters: category prefix to restrict results to
Returns:
    List of candidates sorted by embedding distance. Each candidate is a dict with the following keys:
        id: product ID
        attributes: attributes in dict form, e.g. {'color': 'green', 'pattern': 'striped'}
        description: string describing product
        distance: embedding distance in range [0,1], 0 being the closest match
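The candidate shape described above can be written down as a typed dict. This is only a sketch for readers consuming the return value: the names Candidate and sort_candidates are hypothetical and are not part of the documented module.

```python
from typing import TypedDict


class Candidate(TypedDict):
    """Hypothetical type mirroring the keys documented for retrieve()."""
    id: str
    attributes: dict[str, str]
    description: str
    distance: float  # documented range [0, 1]; 0 is the closest match


def sort_candidates(candidates: list[Candidate]) -> list[Candidate]:
    """Return candidates ordered from closest to farthest."""
    return sorted(candidates, key=lambda c: c["distance"])


example: list[Candidate] = [
    {"id": "b", "attributes": {"color": "red"}, "description": "Red mug", "distance": 0.4},
    {"id": "a", "attributes": {"color": "green"}, "description": "Green mug", "distance": 0.1},
]
closest = sort_candidates(example)[0]
```

Since retrieve is documented as already returning candidates sorted by distance, a sort like this would only be needed when merging results from several calls.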
def generate_prompt(desc: str, candidates: list[dict]) -> str
Populate LLM prompt template.
Args:
    desc: product description
    candidates: list of dicts with the following keys:
        attributes: attributes in dict form, e.g. {'color': 'green', 'pattern': 'striped'}
        description: string describing product
Returns: prompt to feed to LLM
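The actual prompt template is not shown in this reference, so the following is only a minimal sketch of how such a template might be populated. PROMPT_TEMPLATE, format_attributes and generate_prompt_sketch are hypothetical names, and the template wording is an assumption chosen to match the '|' separated answer format documented for parse_answer.

```python
# Hypothetical template; the real one used by generate_prompt is not documented here.
PROMPT_TEMPLATE = (
    "Suggest attributes for the product below.\n"
    "Answer as '|' separated key:value pairs.\n\n"
    "Similar products:\n{examples}\n\n"
    "Product: {desc}\n"
    "Attributes:"
)


def format_attributes(attributes: dict[str, str]) -> str:
    """Render attributes as 'key:value|key:value'."""
    return "|".join(f"{key}:{value}" for key, value in attributes.items())


def generate_prompt_sketch(desc: str, candidates: list[dict]) -> str:
    """Fill the template with the description and candidate examples."""
    examples = "\n".join(
        f"Product: {c['description']}\nAttributes: {format_attributes(c['attributes'])}"
        for c in candidates
    )
    return PROMPT_TEMPLATE.format(examples=examples, desc=desc)


prompt = generate_prompt_sketch(
    "Striped green scarf",
    [{"attributes": {"color": "green", "pattern": "striped"},
      "description": "Green striped shirt"}],
)
```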
def parse_answer(ans: str) -> dict[str, str]
Translate LLM response into dict.
Args:
    ans: '|' separated key-value pairs, e.g. 'color:red|size:large'
Returns:
    ans as a dictionary
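A minimal sketch of parsing this format is shown below. It is not the module's implementation: parse_answer_sketch is a hypothetical name, and raising ValueError on a malformed pair is an assumption about how parse failures might be signalled.

```python
def parse_answer_sketch(ans: str) -> dict[str, str]:
    """Parse 'key:value|key:value' into a dict (illustrative only)."""
    result: dict[str, str] = {}
    for pair in ans.split("|"):
        pair = pair.strip()
        if not pair:
            continue  # tolerate empty segments such as a trailing '|'
        key, sep, value = pair.partition(":")
        if not sep or not key.strip():
            # Assumption: malformed output is reported with ValueError.
            raise ValueError(f"Malformed key-value pair: {pair!r}")
        result[key.strip()] = value.strip()
    return result


parsed = parse_answer_sketch("color:red|size:large")
# parsed == {'color': 'red', 'size': 'large'}
```

Splitting each pair with partition rather than split keeps any further ':' characters inside the value.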
def generate_attributes(desc: str, candidates: list[dict]) -> m.AttributeValue
Use an LLM to determine attributes given nearest-neighbor candidates.
Args:
    desc: product description
    candidates: list of dicts with the following keys:
        attributes: attributes in dict form, e.g. {'color': 'green', 'pattern': 'striped'}
        description: string describing product
Returns: attributes in dict form, e.g. {'color': 'green', 'pattern': 'striped'}
def retrieve_and_generate_attributes(
    desc: str,
    category: Optional[str] = None,
    image: Optional[str] = None,
    base64: bool = False,
    filters: list[str] = []) -> m.ProductAttributes
RAG approach to generating product attributes.
Since LLM answers are not always well formatted, if the LLM answer cannot be parsed, the function falls back to a greedy retrieval approach.
Args:
    desc: user-provided description of the product
    category: category of the product
    image: can be a local file path, GCS URI or base64-encoded image
    base64: True indicates image is base64. False (default) will be interpreted as an image path (either local or GCS)
    num_neighbors: number of nearest neighbors to return for EACH embedding
    filters: category prefix to restrict results to
Returns: attributes in dict form, e.g. {'color': 'green', 'pattern': 'striped'}
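The retrieve, generate and fall-back flow can be sketched as follows. This is an illustration under stated assumptions, not the module's code: rag_with_fallback is a hypothetical name, the retrieval and generation steps are passed in as callables, a parse failure is assumed to surface as ValueError, and "greedy retrieval" is assumed to mean reusing the attributes of the single closest candidate.

```python
from typing import Callable


def rag_with_fallback(
    desc: str,
    retrieve_fn: Callable[[str], list[dict]],
    generate_fn: Callable[[str, list[dict]], dict[str, str]],
) -> dict[str, str]:
    """Generate attributes with an LLM, falling back to the nearest neighbor."""
    candidates = retrieve_fn(desc)
    if not candidates:
        return {}
    try:
        return generate_fn(desc, candidates)
    except ValueError:
        # Assumption: the greedy fallback copies the closest candidate's
        # attributes; candidates are sorted by distance, so index 0 is closest.
        return dict(candidates[0]["attributes"])


def fake_retrieve(desc: str) -> list[dict]:
    """Stand-in for retrieve(), returning one fixed candidate."""
    return [{"id": "a", "attributes": {"color": "green"},
             "description": "Green mug", "distance": 0.1}]


def failing_generate(desc: str, candidates: list[dict]) -> dict[str, str]:
    """Stand-in for an LLM step whose answer cannot be parsed."""
    raise ValueError("unparseable LLM answer")


attributes = rag_with_fallback("green cup", fake_retrieve, failing_generate)
```

Passing the two steps in as callables keeps the sketch runnable without an embedding index or an LLM.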