Facility teams live in a swirl of data: BMS alarms, work orders, sensor streams, comfort complaints, energy spikes, daily inspections and compliance checks. Every morning the question is the same: What needs attention today, and in what order? Is the vital equipment supporting the building’s core systems running as expected? Facility Managers (FMs) can spend several hours each morning flipping between browser tabs as they walk through different systems, including the BMS, EMS, and CMMS, just to understand the day’s priorities and whether a critical failure in a core system requires all hands on deck.

Information buried in a myriad of complex systems puts the onus on FMs to pull relevant data together every day. Willow strives to simplify this with a consolidated report that tells the team what needs action. Because Willow continuously monitors for indoor environments trending toward discomfort, issues can be addressed proactively, before they become noticeable and escalate into complaints. This is refreshing for overloaded teams drowning in notification fatigue.

Australian property management trust Investa has answered that question with Willow’s new AI-generated Reports feature. Teams can receive an auto-generated daily “pre-start” report to peruse before even grabbing their first cup of coffee. It has everything FMs need to ensure that the building functions properly throughout the day.

Let’s walk through Investa’s approach using Willow. We’ll showcase a sample report, and then unpack the process used to build it. 

The Importance of Repeatability 

Willow’s Reports excel at summarizing complex details. Investa leveraged this to build a traffic light-style report, making it easy to see at a glance which areas are green and functioning as expected, and which are yellow or red and need attention. However, probabilistic systems also present a challenge: they can generate slightly different outputs even when given the same prompt. This creates tension with the expectations set by traditional deterministic algorithms, where the same input consistently produces the same output. Designing AI-driven workflows requires reconciling those two worlds.

Investa started with a clear goal: simplify the prioritization of daily tasks and inspections. It was also important that the Willow solution integrate with existing systems and workflows. Across both, consistency and repeatability would be key to ensuring control and reliability. That meant natural-language prompts had to produce responses that were consistent in both qualitative and quantitative terms: recommendations would need to be similar even if stated in different words, and counts and values would have to match even if the prompt was run several times in a row.

With Willow, reliability shows up across the entire workflow. Data surfaced in each report matches the underlying sources: the Knowledge Graph, tickets, insights and real-time telemetry. The report is delivered every morning as an email summary with a link to the full report in Willow. As a result, FMs start their day with a unified summary that filters the information down to what needs action immediately and what they need to monitor through the day.

Sample Report Output 

Below is a sample of Investa’s AI-generated daily report for FMs in Willow.

Disclaimer: This is a sample report. Responses may vary. Data points listed are examples only. Energy and cost savings are dependent on the number of insights enabled in Willow. 

Investa’s Daily Report Creation Process 

Investa started by defining the structure of the report based on FM reality. The team reverse-engineered the morning routine and then built a system to support it, end to end. Below is the playbook.

Step 1. Define Daily Decisions 

Identify the key information FMs must know at the start of each day. Which data sources do they rely on, and in what order do they triage issues? In Investa’s case, the status of core building systems surfaces near the top, whereas outstanding inspections sit lower in the list. Determine the best way for FMs to receive this information; this may be an SMS or email summary that can be read on their phones, with a link to the full report for review on the desktop. A shared vocabulary ensures a common understanding of terms such as ‘critical’ vs. ‘high’, ‘near failure’, and ‘comfort variance’.

This step ensures the briefing supports actual decisions rather than merely surfacing interesting information.
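To make that shared vocabulary concrete, a team might pin it down in a small, version-controlled definition that both prompts and people reference. Below is a minimal sketch in Python; the severity levels, triage order and glossary entries are illustrative assumptions, not Willow terminology.

```python
from enum import IntEnum

class Severity(IntEnum):
    """Hypothetical shared severity scale; lower value = higher triage priority."""
    CRITICAL = 1   # core system down or near failure: all hands on deck
    HIGH = 2       # degraded core system: action required today
    MEDIUM = 3     # comfort variance or overdue inspection: monitor today
    LOW = 4        # informational: review when time allows

# Illustrative glossary so 'near failure' means the same thing to every FM.
GLOSSARY = {
    "near failure": "asset predicted to fail within the next 48 hours",
    "comfort variance": "zone conditions outside the setpoint band for 30+ minutes",
}
```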

Step 2. Select Use Cases 

Spell out a set of key use cases that map to daily workflows: what needs action or monitoring each day, and based on which rules or metrics. These use cases become the sections of the report.

Operational insights show comfort across building floors, followed by traffic light indicators for the status of critical plant such as AHUs, chillers and hot water systems. A review of ambient systems comes next, predicting which spaces could become uncomfortable based on the weather forecast, with recommended actions. Then comes a schedule of the vendors arriving to service the building throughout the day, followed by work orders and inspections with a summary of overdue tasks and invoices. Finally, compliance and audit checks give visibility into stagnated work items.

Each use case is defined with inputs → logic → outputs → success criteria, so teams can test and refine with real data. 
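One lightweight way to capture that structure is a simple record per use case. The sketch below is a hypothetical Python representation; the field names and values are examples only and none of them come from Willow’s API.

```python
from dataclasses import dataclass

@dataclass
class UseCase:
    """One report section, pinned down so it can be tested with real data.
    Field names and values are illustrative, not part of Willow's API."""
    name: str                # e.g. "Critical plant status"
    inputs: list[str]        # data sources: telemetry points, tickets, insights
    logic: str               # the rule or metric that drives the section
    outputs: list[str]       # what the section must show
    success_criteria: str    # how the team judges the section is working

critical_plant = UseCase(
    name="Critical plant status",
    inputs=["AHU telemetry", "chiller fault insights", "open critical tickets"],
    logic="traffic light: red if a fault insight is open, yellow if trending out of band",
    outputs=["per-asset red/yellow/green status", "one-line reason for any non-green asset"],
    success_criteria="status matches BMS state on daily spot checks for two weeks",
)
```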

Step 3. Data Definitions for Use Cases 

The AI backend powering prompts and responses in Willow can reference the Knowledge Graph, telemetry, tickets, Skills and Insights, combining probabilistic and deterministic workflows. This grounding is key to the repeatability of responses.

For each use case, define the time window to analyze for the response, e.g., the last 24 hours, or a rolling view of the prior week for trends. Where needed, identify the aggregations or thresholds required to process the dataset. Specify where it makes sense to ask for a short summary or recommendations in the response.
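As a rough illustration of how such a definition can flow into prompt text, consider the hypothetical helper below; the function, wording and thresholds are assumptions for the sketch, not Willow functionality.

```python
# Hypothetical helper showing how a data definition can become the
# data-bound portion of a prompt; wording and thresholds are examples only.
def to_prompt_clause(window: str, aggregation: str, threshold: str) -> str:
    return (
        f"Using the {aggregation} over the {window}, "
        f"list every zone where readings were {threshold}, "
        f"then add a two-sentence summary with recommended actions."
    )

print(to_prompt_clause(
    window="last 24 hours",
    aggregation="hourly average zone temperature per floor",
    threshold="more than 2 degC above setpoint for 30 minutes or longer",
))
```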

These data definitions feed directly into the prompts written in the next step.

Step 4. Prompt Engineering 

Each section of the report is generated by a specific prompt tuned for FM vocabulary and actionability. The prompts are tailored for target use cases and are data-bound.  

To ensure repeatability of responses, prompts are structured around a few key characteristics. Clear context and structure specify whether data is to be pulled from telemetry, insights or tickets. A scope identifying a building, assets or points puts bounds on the data in the expected response. Statements are made in plain English rather than pseudocode, and prompt length is limited to increase consistency and repeatability. Willow’s Report creation feature allows multiple short prompts to be sequenced into individual sections where context can be reset, and these are stitched together to form a longer report.
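Conceptually, the sequencing works like the sketch below. The `run_prompt` callable, the building name and the section prompts are placeholders assumed for illustration; this is not Willow’s API, just the shape of the pattern: one short, scoped prompt per section, a fresh context each time, and the outputs stitched together.

```python
# Conceptual sketch of sequencing short, scoped prompts into report sections.
# `run_prompt` is a placeholder for the underlying LLM call, not Willow's API;
# each call is assumed to start from a fresh context.

SECTION_PROMPTS = {
    "Comfort": (
        "For building ABC, summarize floors outside the comfort band in the "
        "last 24 hours, using insight data only. Five bullets maximum."
    ),
    "Critical plant": (
        "For building ABC, give a red/yellow/green status for each AHU, "
        "chiller and hot water system based on open fault insights."
    ),
    "Work orders": (
        "For building ABC, list work orders overdue by more than three days, "
        "with asset name and days overdue."
    ),
}

def build_report(run_prompt) -> str:
    sections = []
    for title, prompt in SECTION_PROMPTS.items():
        body = run_prompt(prompt)        # fresh context for every section
        sections.append(f"{title}\n{body}")
    return "\n\n".join(sections)         # stitch sections into one report
```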

A data QA exercise ensures repeatability and consistency in responses. Prompts are run multiple times so preview responses can be tested and compared, then tweaked with additional context or reworded as needed.
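A minimal version of that check can be automated. The sketch below assumes the same placeholder `run_prompt` as above and treats the numbers quoted in a response as the repeatability signal, following the requirement that counts and values stay consistent even when the wording varies.

```python
import re

# Hypothetical QA check: run the same prompt several times and confirm the
# numbers quoted in each response agree before trusting the prompt.
def extract_numbers(text: str) -> list[str]:
    return re.findall(r"\d+(?:\.\d+)?", text)

def is_repeatable(run_prompt, prompt: str, runs: int = 5) -> bool:
    responses = [run_prompt(prompt) for _ in range(runs)]
    baseline = extract_numbers(responses[0])
    # Wording may differ between runs; counts and values must not.
    return all(extract_numbers(r) == baseline for r in responses)
```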

Step 5. Pilot and Scale 

Piloting on a single site is a great place to start. Monitor test reports for a few days to ensure that daily runs return correct and consistent responses. Run the report for a couple of weeks and review it for feedback daily. Tweak the prompts as needed, and run parallel reports to compare results. This is also a great time to bring on a few FMs to beta-test the process and validate the impact.

A successful pilot creates confidence to scale out to other sites. The report can also be leveraged to address additional audiences and personas.  

Closing Thoughts 

With Willow’s AI-generated Reports, Investa transformed a daily ritual from manual stitching into automated clarity. The benefits are tangible. Teams begin the day with shared facts and clear priorities. There are fewer surprises, as emerging issues are flagged early with hypotheses and next steps. Anomalies are caught and actioned in time. Comfort clusters are identified and addressed proactively, increasing occupant satisfaction. Leaders get a unified view of the entire portfolio and can make informed decisions.

The effort invested in structuring and validating prompts for repeatability ensures that the data in the reports can be trusted. Underlying all this is the delicate balance of combining the benefits of probabilistic, non-deterministic LLMs with traditional deterministic techniques. The result is the ability to scale with AI-driven efficiencies, reducing busywork and alarm fatigue so teams can refocus on the strategic work that truly moves the needle.