Productivity

Small time savings were observed across most use cases, with written tasks presenting the largest time savings. However, some tasks, like scheduling and generating images, took additional time to complete when participants used M365 Copilot. The additional time was primarily caused by either M365 Copilot being unable to produce high-quality outputs or the task being extra workload that was completed only because users had M365 Copilot.

Some of us lean on AI coding to push side projects faster into the delivery pipeline. These are not core product features but experiments and MVP-style initiatives. For bringing that kind of work to its first version, the speed-up is real. … output quality gets worse the more context you add. The model starts pulling in irrelevant details from earlier prompts, and accuracy drops. … AI can get you 70% of the way, but the last 30% is the hard part.

“When we talk about acceptance rate, a lot of the metrics that were popularized early on were metrics that were meant to show whether or not the tools were fit for purpose, not to measure the impact of them across an organization,” she explained. Seeing lots of “acceptance rate” and “% of code written.” Here’s the deal: we solved the problem. It’s outcomes: the number of shippable units that delight your customers.

At Meta, a diff is a pull request, and Diff Authoring Time (DAT) focuses on the inner development loop – the writing, building, testing, and debugging of code. The tech giant emphasizes that diffs should be kept small and reviewable, which, in part, accounts for its average DAT of 50 minutes across 87% of available diffs. I feel like it’s the continuous application of scientific management (Taylorism) here that falls short. Gallup has found time and again that worker effectiveness has a lot more to do with environment than with DATs.

Though each experiment is noisy, when data is combined across three experiments and 4,867 developers, our analysis reveals a 26.08% increase (SE: 10.3%) in completed tasks among developers using the AI tool. Notably, less experienced developers had higher adoption rates and greater productivity gains. … We find that Copilot significantly raises task completion for more recent hires and those in more junior positions but not for developers with longer tenure and in more senior positions.
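
To put the quoted standard error in perspective, here is a minimal sketch that turns the reported estimate (26.08%) and standard error (10.3%) into an approximate 95% confidence interval. It assumes a normal approximation, which is a simplification of ours and not necessarily the interval the authors report; the helper name normal_ci is hypothetical.

```python
from statistics import NormalDist


def normal_ci(estimate: float, std_error: float, level: float = 0.95) -> tuple[float, float]:
    """Two-sided confidence interval under a normal approximation (an assumption)."""
    # Critical value for the requested level, about 1.96 for 95%.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * std_error
    return estimate - half_width, estimate + half_width


# Figures taken from the excerpt above, in percentage points.
estimate = 26.08
std_error = 10.3

low, high = normal_ci(estimate, std_error)
print(f"Approximate 95% CI: {low:.1f}% to {high:.1f}%")
```

Under that assumption the interval runs from roughly 6% to 46%, which is consistent with the excerpt's description of the individual experiments as noisy: the pooled effect is distinguishable from zero, but its size is only loosely pinned down.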

Author: Department for Business and Trade

Author: Lisa Dziuba

Author: Chantal Kapani

Author: UK Gov

Author: Jennifer Riggins

Author: Kevin Zheyuan Cui, Mert Demirer, Sonia Jaffe, Leon Musolff, Sida Peng, and Tobias Salz