{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":730932260,"defaultBranch":"main","name":"torchtitan","ownerLogin":"pytorch","currentUserCanPush":false,"isFork":false,"isEmpty":false,"createdAt":"2023-12-13T01:51:37.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/21003710?v=4","public":true,"private":false,"isOrgOwned":true},"refInfo":{"name":"","listCacheKey":"v0:1717631966.0","currentOid":""},"activityList":{"items":[{"before":"51825cafd92e0ae18a887e57a2dcf0c414ce9765","after":"22a749029444caba2b68a344205eba67bc7c6215","ref":"refs/heads/gh/XilunWu/2/orig","pushedAt":"2024-06-09T00:40:25.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"XilunWu","name":"Xilun Wu","path":"/XilunWu","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/12968408?s=80&v=4"},"commit":{"message":"enable TritonFusedRMSNorm with local_map annotation\n\nghstack-source-id: 213ef4323f9888463076ea580c3b72e2359ec492\nPull Request resolved: https://github.com/pytorch/torchtitan/pull/364","shortMessageHtmlLink":"enable TritonFusedRMSNorm with local_map annotation"}},{"before":"71659de492ae262efcdaf2860d4d16db9ee3715a","after":"2cf57abe736ad60226d31cc56f0d3a848362462d","ref":"refs/heads/gh/XilunWu/2/head","pushedAt":"2024-06-09T00:40:23.000Z","pushType":"push","commitsCount":16,"pusher":{"login":"XilunWu","name":"Xilun Wu","path":"/XilunWu","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/12968408?s=80&v=4"},"commit":{"message":"Update on \"enable TritonFusedRMSNorm with local_map annotation\"\n\n\r\n**Test Plan**\r\nHere's the output of running `CONFIG_FILE=./train_configs/llama3_8b.toml NGPU=8 LOG_RANK=0,1,2,3,4,5,6,7 ./run_llama_train.sh` using 4-way Tensor Parallel (`tensor_parallel_degree = 4`):\r\n1. with `norm_type = \"rmsnorm\"`\r\n```\r\n[rank0]:2024-06-05 11:57:35,505 - root - INFO - step: 1 loss: 12.2703 memory: 24.66GiB(31.15%) wps: 143 mfu: 2.66%\r\n[rank0]:2024-06-05 11:57:35,505 - root - INFO - Synchronizing and adjusting timeout for all ProcessGroups to 0:01:40\r\n[rank0]:2024-06-05 11:58:11,490 - root - INFO - step: 10 loss: 11.0446 memory: 31.96GiB(40.37%) wps: 512 mfu: 9.51%\r\n[rank0]:2024-06-05 11:58:46,488 - root - INFO - step: 20 loss: 9.2321 memory: 31.96GiB(40.37%) wps: 586 mfu: 10.87%\r\n[rank0]:2024-06-05 11:59:22,462 - root - INFO - step: 30 loss: 8.2184 memory: 31.96GiB(40.37%) wps: 570 mfu: 10.58%\r\n[rank0]:2024-06-05 11:59:57,301 - root - INFO - step: 40 loss: 7.6220 memory: 31.96GiB(40.37%) wps: 589 mfu: 10.93%\r\n[rank0]:2024-06-05 12:00:32,254 - root - INFO - step: 50 loss: 7.5399 memory: 31.96GiB(40.37%) wps: 587 mfu: 10.89%\r\n[rank0]:2024-06-05 12:01:07,155 - root - INFO - step: 60 loss: 7.3179 memory: 31.96GiB(40.37%) wps: 588 mfu: 10.91%\r\n[rank0]:2024-06-05 12:01:41,999 - root - INFO - step: 70 loss: 7.3508 memory: 31.96GiB(40.37%) wps: 589 mfu: 10.92%\r\n[rank0]:2024-06-05 12:02:17,093 - root - INFO - step: 80 loss: 7.2696 memory: 31.96GiB(40.37%) wps: 584 mfu: 10.85%\r\n[rank0]:2024-06-05 12:02:52,009 - root - INFO - step: 90 loss: 7.0481 memory: 31.96GiB(40.37%) wps: 588 mfu: 10.91%\r\n[rank0]:2024-06-05 12:03:27,715 - root - INFO - step: 100 loss: 6.9623 memory: 31.96GiB(40.37%) wps: 575 mfu: 10.67%\r\n```\r\n\r\n3. with `norm_type = \"fused_rmsnorm\"`\r\n```[rank0]:2024-06-05 12:08:35,004 - root - INFO - step: 1 loss: 12.2422 memory: 24.62GiB(31.10%) wps: 95 mfu: 1.76%\r\n[rank0]:2024-06-05 12:08:35,004 - root - INFO - Synchronizing and adjusting timeout for all ProcessGroups to 0:01:40\r\n[rank0]:2024-06-05 12:09:12,401 - root - INFO - step: 10 loss: 11.0361 memory: 32.09GiB(40.54%) wps: 493 mfu: 9.15%\r\n[rank0]:2024-06-05 12:09:49,380 - root - INFO - step: 20 loss: 9.2725 memory: 32.09GiB(40.54%) wps: 554 mfu: 10.29%\r\n[rank0]:2024-06-05 12:10:26,645 - root - INFO - step: 30 loss: 8.2091 memory: 32.09GiB(40.54%) wps: 550 mfu: 10.21%\r\n[rank0]:2024-06-05 12:11:03,616 - root - INFO - step: 40 loss: 7.5601 memory: 32.09GiB(40.54%) wps: 555 mfu: 10.30%\r\n[rank0]:2024-06-05 12:11:40,625 - root - INFO - step: 50 loss: 7.5144 memory: 32.09GiB(40.54%) wps: 554 mfu: 10.29%\r\n[rank0]:2024-06-05 12:12:17,768 - root - INFO - step: 60 loss: 7.3869 memory: 32.09GiB(40.54%) wps: 552 mfu: 10.25%\r\n[rank0]:2024-06-05 12:12:54,820 - root - INFO - step: 70 loss: 7.3358 memory: 32.09GiB(40.54%) wps: 553 mfu: 10.27%\r\n[rank0]:2024-06-05 12:13:31,817 - root - INFO - step: 80 loss: 7.2085 memory: 32.09GiB(40.54%) wps: 554 mfu: 10.29%\r\n[rank0]:2024-06-05 12:14:09,156 - root - INFO - step: 90 loss: 7.0140 memory: 32.09GiB(40.54%) wps: 549 mfu: 10.19%\r\n[rank0]:2024-06-05 12:14:48,518 - root - INFO - step: 100 loss: 6.9507 memory: 32.09GiB(40.54%) wps: 521 mfu: 9.67%```\n\n[ghstack-poisoned]","shortMessageHtmlLink":"Update on \"enable TritonFusedRMSNorm with local_map annotation\""}},{"before":"1ceaa4e2adc8ef5a0864f99e126e4ab18cd7db8f","after":"66c7c93a8236b17518834d2ac490892e0016a51e","ref":"refs/heads/gh/XilunWu/2/base","pushedAt":"2024-06-09T00:40:20.000Z","pushType":"push","commitsCount":15,"pusher":{"login":"XilunWu","name":"Xilun Wu","path":"/XilunWu","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/12968408?s=80&v=4"},"commit":{"message":"Update base for Update on \"enable TritonFusedRMSNorm with local_map annotation\"\n\n\r\n**Test Plan**\r\nHere's the output of running `CONFIG_FILE=./train_configs/llama3_8b.toml NGPU=8 LOG_RANK=0,1,2,3,4,5,6,7 ./run_llama_train.sh` using 4-way Tensor Parallel (`tensor_parallel_degree = 4`):\r\n1. with `norm_type = \"rmsnorm\"`\r\n```\r\n[rank0]:2024-06-05 11:57:35,505 - root - INFO - step: 1 loss: 12.2703 memory: 24.66GiB(31.15%) wps: 143 mfu: 2.66%\r\n[rank0]:2024-06-05 11:57:35,505 - root - INFO - Synchronizing and adjusting timeout for all ProcessGroups to 0:01:40\r\n[rank0]:2024-06-05 11:58:11,490 - root - INFO - step: 10 loss: 11.0446 memory: 31.96GiB(40.37%) wps: 512 mfu: 9.51%\r\n[rank0]:2024-06-05 11:58:46,488 - root - INFO - step: 20 loss: 9.2321 memory: 31.96GiB(40.37%) wps: 586 mfu: 10.87%\r\n[rank0]:2024-06-05 11:59:22,462 - root - INFO - step: 30 loss: 8.2184 memory: 31.96GiB(40.37%) wps: 570 mfu: 10.58%\r\n[rank0]:2024-06-05 11:59:57,301 - root - INFO - step: 40 loss: 7.6220 memory: 31.96GiB(40.37%) wps: 589 mfu: 10.93%\r\n[rank0]:2024-06-05 12:00:32,254 - root - INFO - step: 50 loss: 7.5399 memory: 31.96GiB(40.37%) wps: 587 mfu: 10.89%\r\n[rank0]:2024-06-05 12:01:07,155 - root - INFO - step: 60 loss: 7.3179 memory: 31.96GiB(40.37%) wps: 588 mfu: 10.91%\r\n[rank0]:2024-06-05 12:01:41,999 - root - INFO - step: 70 loss: 7.3508 memory: 31.96GiB(40.37%) wps: 589 mfu: 10.92%\r\n[rank0]:2024-06-05 12:02:17,093 - root - INFO - step: 80 loss: 7.2696 memory: 31.96GiB(40.37%) wps: 584 mfu: 10.85%\r\n[rank0]:2024-06-05 12:02:52,009 - root - INFO - step: 90 loss: 7.0481 memory: 31.96GiB(40.37%) wps: 588 mfu: 10.91%\r\n[rank0]:2024-06-05 12:03:27,715 - root - INFO - step: 100 loss: 6.9623 memory: 31.96GiB(40.37%) wps: 575 mfu: 10.67%\r\n```\r\n\r\n3. with `norm_type = \"fused_rmsnorm\"`\r\n```[rank0]:2024-06-05 12:08:35,004 - root - INFO - step: 1 loss: 12.2422 memory: 24.62GiB(31.10%) wps: 95 mfu: 1.76%\r\n[rank0]:2024-06-05 12:08:35,004 - root - INFO - Synchronizing and adjusting timeout for all ProcessGroups to 0:01:40\r\n[rank0]:2024-06-05 12:09:12,401 - root - INFO - step: 10 loss: 11.0361 memory: 32.09GiB(40.54%) wps: 493 mfu: 9.15%\r\n[rank0]:2024-06-05 12:09:49,380 - root - INFO - step: 20 loss: 9.2725 memory: 32.09GiB(40.54%) wps: 554 mfu: 10.29%\r\n[rank0]:2024-06-05 12:10:26,645 - root - INFO - step: 30 loss: 8.2091 memory: 32.09GiB(40.54%) wps: 550 mfu: 10.21%\r\n[rank0]:2024-06-05 12:11:03,616 - root - INFO - step: 40 loss: 7.5601 memory: 32.09GiB(40.54%) wps: 555 mfu: 10.30%\r\n[rank0]:2024-06-05 12:11:40,625 - root - INFO - step: 50 loss: 7.5144 memory: 32.09GiB(40.54%) wps: 554 mfu: 10.29%\r\n[rank0]:2024-06-05 12:12:17,768 - root - INFO - step: 60 loss: 7.3869 memory: 32.09GiB(40.54%) wps: 552 mfu: 10.25%\r\n[rank0]:2024-06-05 12:12:54,820 - root - INFO - step: 70 loss: 7.3358 memory: 32.09GiB(40.54%) wps: 553 mfu: 10.27%\r\n[rank0]:2024-06-05 12:13:31,817 - root - INFO - step: 80 loss: 7.2085 memory: 32.09GiB(40.54%) wps: 554 mfu: 10.29%\r\n[rank0]:2024-06-05 12:14:09,156 - root - INFO - step: 90 loss: 7.0140 memory: 32.09GiB(40.54%) wps: 549 mfu: 10.19%\r\n[rank0]:2024-06-05 12:14:48,518 - root - INFO - step: 100 loss: 6.9507 memory: 32.09GiB(40.54%) wps: 521 mfu: 9.67%```\n\n[ghstack-poisoned]","shortMessageHtmlLink":"Update base for Update on \"enable TritonFusedRMSNorm with local_map a…"}},{"before":"baa678ca7a7b1d73494ed9d3c22b858b96cea1c0","after":"104bd6c5f8fa91dec3806e105a13bbb82fcc5b35","ref":"refs/heads/main","pushedAt":"2024-06-07T19:58:11.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"wanchaol","name":"Wanchao","path":"/wanchaol","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/9443650?s=80&v=4"},"commit":{"message":"Abstract out out optimizer params and update foreach calling convention (#386)\n\n# Summary\r\nUpdates the behavior to call foreach when we are not using fused for the\r\noptimizer","shortMessageHtmlLink":"Abstract out out optimizer params and update foreach calling conventi…"}},{"before":"7cf41bbb65b593e6f47f65c4cc5df5c99043e0fd","after":"baa678ca7a7b1d73494ed9d3c22b858b96cea1c0","ref":"refs/heads/main","pushedAt":"2024-06-06T22:32:48.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"wz337","name":"Iris Z","path":"/wz337","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/31293777?s=80&v=4"},"commit":{"message":"[torchtitan] Fix test runner fused optim tests (#384)","shortMessageHtmlLink":"[torchtitan] Fix test runner fused optim tests (#384)"}},{"before":"3bc767897426ffde0cdc6bbb24ad437a1dd09224","after":"7cf41bbb65b593e6f47f65c4cc5df5c99043e0fd","ref":"refs/heads/main","pushedAt":"2024-06-06T18:41:19.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"wz337","name":"Iris Z","path":"/wz337","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/31293777?s=80&v=4"},"commit":{"message":"[torchtitan][optim] Add fused as an option in train config (#355)\n\nWith these three PRs landed, we can now support the option fused=True in\r\ntorchtitan for Adam and AdamW optimizer.\r\n\r\nhttps://github.com/pytorch/pytorch/pull/125369\r\nhttps://github.com/pytorch/pytorch/pull/126423\r\nhttps://github.com/pytorch/pytorch/pull/126750\r\n\r\nRun performance evaluation on 8 A100 DevGPU: 1000 steps on 1D DP default\r\n[llama_8b.toml](https://github.com/pytorch/torchtitan/blob/main/train_configs/llama3_8b.toml).\r\n\r\nObservation: \r\nFor `fused = True` and `fused = False`, we observed similar loss curve\r\nand memory usage.\r\nwps is + ~100 and mfu is + 1.5-2% when fused = True. \r\n\r\nBelow are the logs for the last 100 steps for both.\r\n```\r\n**Fused = False**\r\n[rank0]:2024-06-05 12:45:06,227 - root - INFO - Finished dumping traces in 0.37 seconds\r\n[rank0]:2024-06-05 12:45:37,677 - root - INFO - step: 910 loss: 4.6039 memory: 59.48GiB(75.15%) wps: 2,217 mfu: 41.16%\r\n[rank0]:2024-06-05 12:46:08,843 - root - INFO - step: 920 loss: 4.6427 memory: 59.48GiB(75.15%) wps: 2,632 mfu: 48.85%\r\n[rank0]:2024-06-05 12:46:40,052 - root - INFO - step: 930 loss: 4.6339 memory: 59.48GiB(75.15%) wps: 2,628 mfu: 48.78%\r\n[rank0]:2024-06-05 12:47:11,243 - root - INFO - step: 940 loss: 4.5964 memory: 59.48GiB(75.15%) wps: 2,631 mfu: 48.84%\r\n[rank0]:2024-06-05 12:47:42,655 - root - INFO - step: 950 loss: 4.6477 memory: 59.48GiB(75.15%) wps: 2,611 mfu: 48.47%\r\n[rank0]:2024-06-05 12:48:13,890 - root - INFO - step: 960 loss: 4.8137 memory: 59.48GiB(75.15%) wps: 2,626 mfu: 48.75%\r\n[rank0]:2024-06-05 12:48:45,110 - root - INFO - step: 970 loss: 4.5962 memory: 59.48GiB(75.15%) wps: 2,628 mfu: 48.78%\r\n[rank0]:2024-06-05 12:49:16,333 - root - INFO - step: 980 loss: 4.5450 memory: 59.48GiB(75.15%) wps: 2,627 mfu: 48.76%\r\n[rank0]:2024-06-05 12:49:47,561 - root - INFO - step: 990 loss: 4.5840 memory: 59.48GiB(75.15%) wps: 2,627 mfu: 48.76%\r\n[rank0]:2024-06-05 12:50:18,933 - root - INFO - step: 1000 loss: 4.5351 memory: 59.48GiB(75.15%) wps: 2,615 mfu: 48.53%\r\n[rank0]:2024-06-05 12:50:23,692 - root - INFO - Dumping traces at step 1000\r\n[rank0]:2024-06-05 12:50:24,041 - root - INFO - Finished dumping traces in 0.35 seconds\r\n[rank0]:2024-06-05 12:50:24,422 - root - INFO - Sleeping 2 seconds for other ranks to complete\r\n[rank0]:2024-06-05 12:50:26,424 - root - INFO - Training completed\r\n\r\n**Fused = True**\r\n[rank0]:2024-06-05 14:55:42,894 - root - INFO - Finished dumping traces in 0.30 seconds\r\n[rank0]:2024-06-05 14:56:13,582 - root - INFO - step: 910 loss: 4.6091 memory: 59.48GiB(75.15%) wps: 2,341 mfu: 43.46%\r\n[rank0]:2024-06-05 14:56:43,765 - root - INFO - step: 920 loss: 4.6468 memory: 59.48GiB(75.15%) wps: 2,718 mfu: 50.45%\r\n[rank0]:2024-06-05 14:57:13,971 - root - INFO - step: 930 loss: 4.6365 memory: 59.48GiB(75.15%) wps: 2,715 mfu: 50.40%\r\n[rank0]:2024-06-05 14:57:44,172 - root - INFO - step: 940 loss: 4.6021 memory: 59.48GiB(75.15%) wps: 2,716 mfu: 50.41%\r\n[rank0]:2024-06-05 14:58:14,353 - root - INFO - step: 950 loss: 4.6522 memory: 59.48GiB(75.15%) wps: 2,718 mfu: 50.45%\r\n[rank0]:2024-06-05 14:58:44,536 - root - INFO - step: 960 loss: 4.8163 memory: 59.48GiB(75.15%) wps: 2,717 mfu: 50.44%\r\n[rank0]:2024-06-05 14:59:14,683 - root - INFO - step: 970 loss: 4.6026 memory: 59.48GiB(75.15%) wps: 2,721 mfu: 50.51%\r\n[rank0]:2024-06-05 14:59:44,840 - root - INFO - step: 980 loss: 4.5491 memory: 59.48GiB(75.15%) wps: 2,720 mfu: 50.49%\r\n[rank0]:2024-06-05 15:00:15,009 - root - INFO - step: 990 loss: 4.5859 memory: 59.48GiB(75.15%) wps: 2,719 mfu: 50.47%\r\n[rank0]:2024-06-05 15:00:45,228 - root - INFO - step: 1000 loss: 4.5396 memory: 59.48GiB(75.15%) wps: 2,714 mfu: 50.38%\r\n[rank0]:2024-06-05 15:00:49,455 - root - INFO - Dumping traces at step 1000\r\n[rank0]:2024-06-05 15:00:49,756 - root - INFO - Finished dumping traces in 0.30 seconds\r\n[rank0]:2024-06-05 15:00:50,336 - root - INFO - Sleeping 2 seconds for other ranks to complete\r\n[rank0]:2024-06-05 15:00:52,339 - root - INFO - Training completed\r\n```","shortMessageHtmlLink":"[torchtitan][optim] Add fused as an option in train config (#355)"}},{"before":"ff501be7e8878eb2df742d3cee4b344dd1d4597c","after":null,"ref":"refs/heads/gh/wconstab/25/orig","pushedAt":"2024-06-05T23:59:26.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"wconstab","name":"Will Constable","path":"/wconstab","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/4984825?s=80&v=4"}},{"before":"241eabcac99f286bbeacc001894f39e7efb172a9","after":null,"ref":"refs/heads/gh/wconstab/25/head","pushedAt":"2024-06-05T23:59:26.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"wconstab","name":"Will Constable","path":"/wconstab","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/4984825?s=80&v=4"}},{"before":"241eabcac99f286bbeacc001894f39e7efb172a9","after":null,"ref":"refs/heads/gh/wconstab/25/base","pushedAt":"2024-06-05T23:59:26.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"wconstab","name":"Will Constable","path":"/wconstab","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/4984825?s=80&v=4"}},{"before":"a6c304712b83b8993dacc4a36e90cda6b15402c0","after":null,"ref":"refs/heads/gh/fegin/3/orig","pushedAt":"2024-06-05T23:59:23.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"wconstab","name":"Will Constable","path":"/wconstab","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/4984825?s=80&v=4"}},{"before":"f50171472f0227dcae72edbe8228c809775e1b38","after":null,"ref":"refs/heads/gh/fegin/3/head","pushedAt":"2024-06-05T23:59:23.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"wconstab","name":"Will Constable","path":"/wconstab","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/4984825?s=80&v=4"}},{"before":"f50171472f0227dcae72edbe8228c809775e1b38","after":null,"ref":"refs/heads/gh/fegin/3/base","pushedAt":"2024-06-05T23:59:23.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"wconstab","name":"Will Constable","path":"/wconstab","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/4984825?s=80&v=4"}},{"before":"0594c048403d8c5ebbd6c895c44264df2edb44c4","after":"3bc767897426ffde0cdc6bbb24ad437a1dd09224","ref":"refs/heads/main","pushedAt":"2024-06-05T23:59:20.000Z","pushType":"push","commitsCount":2,"pusher":{"login":"wconstab","name":"Will Constable","path":"/wconstab","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/4984825?s=80&v=4"},"commit":{"message":"Add 3D support\n\nEnables PP+DP+TP and adds CI test case that runs on 8-gpu CI runner.\n\nghstack-source-id: 7e2d6879d39e78fc7e6d46fd775bb6dfe08ff708\nPull Request resolved: https://github.com/pytorch/torchtitan/pull/344","shortMessageHtmlLink":"Add 3D support"}},{"before":"53af464e454dad3440fd3032bfdef349f82722c1","after":"241eabcac99f286bbeacc001894f39e7efb172a9","ref":"refs/heads/gh/wconstab/25/base","pushedAt":"2024-06-05T23:59:18.000Z","pushType":"push","commitsCount":18,"pusher":{"login":"wconstab","name":"Will Constable","path":"/wconstab","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/4984825?s=80&v=4"},"commit":{"message":"Update\n\n[ghstack-poisoned]","shortMessageHtmlLink":"Update"}},{"before":"89e588eb9ae69cc90d62f5171237cf2a6b61cb81","after":"f50171472f0227dcae72edbe8228c809775e1b38","ref":"refs/heads/gh/fegin/3/base","pushedAt":"2024-06-05T23:59:16.000Z","pushType":"push","commitsCount":7,"pusher":{"login":"wconstab","name":"Will Constable","path":"/wconstab","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/4984825?s=80&v=4"},"commit":{"message":"Update\n\n[ghstack-poisoned]","shortMessageHtmlLink":"Update"}},{"before":"21b8a9e0224e0c8e908f2a82b6adc9f6a1464af3","after":"a6c304712b83b8993dacc4a36e90cda6b15402c0","ref":"refs/heads/gh/fegin/3/orig","pushedAt":"2024-06-05T07:46:50.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"fegin","name":"Chien-Chin Huang","path":"/fegin","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2461448?s=80&v=4"},"commit":{"message":"[RFC] Allow ModelWrapper and OptimizerWrapper to accept multiple models\nand optimizers\n\nghstack-source-id: 190220813ece188728a3c776e6839a323009f719\nPull Request resolved: https://github.com/pytorch/torchtitan/pull/360","shortMessageHtmlLink":"[RFC] Allow ModelWrapper and OptimizerWrapper to accept multiple models"}},{"before":"9f8e186617f0e78bd268946680c3003f80e75b9b","after":"f50171472f0227dcae72edbe8228c809775e1b38","ref":"refs/heads/gh/fegin/3/head","pushedAt":"2024-06-05T07:46:47.000Z","pushType":"push","commitsCount":2,"pusher":{"login":"fegin","name":"Chien-Chin Huang","path":"/fegin","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2461448?s=80&v=4"},"commit":{"message":"Update\n\n[ghstack-poisoned]","shortMessageHtmlLink":"Update"}},{"before":"bc65295e6fdedbc093363d1e57fccb25113dbaec","after":"89e588eb9ae69cc90d62f5171237cf2a6b61cb81","ref":"refs/heads/gh/fegin/3/base","pushedAt":"2024-06-05T07:46:44.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"fegin","name":"Chien-Chin Huang","path":"/fegin","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2461448?s=80&v=4"},"commit":{"message":"Update (base update)\n\n[ghstack-poisoned]","shortMessageHtmlLink":"Update (base update)"}},{"before":"15d5fb7206b176d57d493cdf1e26377ea43a4592","after":"21b8a9e0224e0c8e908f2a82b6adc9f6a1464af3","ref":"refs/heads/gh/fegin/3/orig","pushedAt":"2024-06-04T23:55:59.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"wconstab","name":"Will Constable","path":"/wconstab","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/4984825?s=80&v=4"},"commit":{"message":"[RFC] Allow ModelWrapper and OptimizerWrapper to accept multiple models and optimizers\n\nghstack-source-id: f9a6fd6039a07617edf76a2447413726039a7400\nPull Request resolved: https://github.com/pytorch/torchtitan/pull/360","shortMessageHtmlLink":"[RFC] Allow ModelWrapper and OptimizerWrapper to accept multiple mode…"}},{"before":"3b7f2bbf36744fb8f23f79c254df37b60ed751a5","after":"ff501be7e8878eb2df742d3cee4b344dd1d4597c","ref":"refs/heads/gh/wconstab/25/orig","pushedAt":"2024-06-04T23:55:59.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"wconstab","name":"Will Constable","path":"/wconstab","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/4984825?s=80&v=4"},"commit":{"message":"Add 3D support\n\nEnables PP+DP+TP and adds CI test case that runs on 8-gpu CI runner.\n\nghstack-source-id: 7e2d6879d39e78fc7e6d46fd775bb6dfe08ff708\nPull Request resolved: https://github.com/pytorch/torchtitan/pull/344","shortMessageHtmlLink":"Add 3D support"}},{"before":"257c770f1ed205a9db7bbfb6b06f72e1e5e9ff18","after":"241eabcac99f286bbeacc001894f39e7efb172a9","ref":"refs/heads/gh/wconstab/25/head","pushedAt":"2024-06-04T23:55:56.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"wconstab","name":"Will Constable","path":"/wconstab","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/4984825?s=80&v=4"},"commit":{"message":"Update\n\n[ghstack-poisoned]","shortMessageHtmlLink":"Update"}},{"before":"e8b0fa94af75ab37ccd885ad587b01603b797d50","after":"51825cafd92e0ae18a887e57a2dcf0c414ce9765","ref":"refs/heads/gh/XilunWu/2/orig","pushedAt":"2024-06-04T21:41:28.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"XilunWu","name":"Xilun Wu","path":"/XilunWu","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/12968408?s=80&v=4"},"commit":{"message":"enable TritonFusedRMSNorm with local_map annotation\n\nghstack-source-id: 6125011aba1a4bd9521fb4a3b761b62285ea6195\nPull Request resolved: https://github.com/pytorch/torchtitan/pull/364","shortMessageHtmlLink":"enable TritonFusedRMSNorm with local_map annotation"}},{"before":"aa5af1be6773abadea9f674f6120a7c43042d3db","after":"71659de492ae262efcdaf2860d4d16db9ee3715a","ref":"refs/heads/gh/XilunWu/2/head","pushedAt":"2024-06-04T21:41:26.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"XilunWu","name":"Xilun Wu","path":"/XilunWu","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/12968408?s=80&v=4"},"commit":{"message":"Update on \"enable TritonFusedRMSNorm with local_map annotation\"\n\n\n\n\n[ghstack-poisoned]","shortMessageHtmlLink":"Update on \"enable TritonFusedRMSNorm with local_map annotation\""}},{"before":"6df5409442aa2c1178bbb5686df9032c4903ccf1","after":"15d5fb7206b176d57d493cdf1e26377ea43a4592","ref":"refs/heads/gh/fegin/3/orig","pushedAt":"2024-06-04T20:04:47.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"wconstab","name":"Will Constable","path":"/wconstab","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/4984825?s=80&v=4"},"commit":{"message":"[RFC] Allow ModelWrapper and OptimizerWrapper to accept multiple models and optimizers\n\nghstack-source-id: f9a6fd6039a07617edf76a2447413726039a7400\nPull Request resolved: https://github.com/pytorch/torchtitan/pull/360","shortMessageHtmlLink":"[RFC] Allow ModelWrapper and OptimizerWrapper to accept multiple mode…"}},{"before":"c77f338b83891ce512748bd7da45c7453559e24d","after":"3b7f2bbf36744fb8f23f79c254df37b60ed751a5","ref":"refs/heads/gh/wconstab/25/orig","pushedAt":"2024-06-04T20:04:47.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"wconstab","name":"Will Constable","path":"/wconstab","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/4984825?s=80&v=4"},"commit":{"message":"Add 3D support\n\nEnables PP+DP+TP and adds CI test case that runs on 8-gpu CI runner.\n\nghstack-source-id: a8976b0c25fd6a268bec9140694f7f1a3f2c8624\nPull Request resolved: https://github.com/pytorch/torchtitan/pull/344","shortMessageHtmlLink":"Add 3D support"}},{"before":"088def6a44a8f0568de2c0344eaf348718c6c6ed","after":"257c770f1ed205a9db7bbfb6b06f72e1e5e9ff18","ref":"refs/heads/gh/wconstab/25/head","pushedAt":"2024-06-04T20:04:44.000Z","pushType":"push","commitsCount":2,"pusher":{"login":"wconstab","name":"Will Constable","path":"/wconstab","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/4984825?s=80&v=4"},"commit":{"message":"Update\n\n[ghstack-poisoned]","shortMessageHtmlLink":"Update"}},{"before":"a04b7882393c18aa5ee1dfb1744b61c9b02b0fcd","after":"9f8e186617f0e78bd268946680c3003f80e75b9b","ref":"refs/heads/gh/fegin/3/head","pushedAt":"2024-06-04T20:04:44.000Z","pushType":"push","commitsCount":2,"pusher":{"login":"wconstab","name":"Will Constable","path":"/wconstab","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/4984825?s=80&v=4"},"commit":{"message":"Update\n\n[ghstack-poisoned]","shortMessageHtmlLink":"Update"}},{"before":"2ee5f13656ad1e39ca9d1e1af65e93dfa699311a","after":"53af464e454dad3440fd3032bfdef349f82722c1","ref":"refs/heads/gh/wconstab/25/base","pushedAt":"2024-06-04T20:04:42.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"wconstab","name":"Will Constable","path":"/wconstab","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/4984825?s=80&v=4"},"commit":{"message":"Update (base update)\n\n[ghstack-poisoned]","shortMessageHtmlLink":"Update (base update)"}},{"before":"a98f22ec1eed4fc2c0a1e88bd0c48fe14e4476ae","after":"bc65295e6fdedbc093363d1e57fccb25113dbaec","ref":"refs/heads/gh/fegin/3/base","pushedAt":"2024-06-04T20:04:42.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"wconstab","name":"Will Constable","path":"/wconstab","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/4984825?s=80&v=4"},"commit":{"message":"Update (base update)\n\n[ghstack-poisoned]","shortMessageHtmlLink":"Update (base update)"}},{"before":"24419f74f8c711dd6b0b101a13319d7bc9f0eb01","after":"c77f338b83891ce512748bd7da45c7453559e24d","ref":"refs/heads/gh/wconstab/25/orig","pushedAt":"2024-06-04T18:04:04.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"wconstab","name":"Will Constable","path":"/wconstab","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/4984825?s=80&v=4"},"commit":{"message":"Add 3D support\n\nEnables PP+DP+TP and adds CI test case that runs on 8-gpu CI runner.\n\nghstack-source-id: fa253a51640b92e561a6b9322828a1ab6c155fde\nPull Request resolved: https://github.com/pytorch/torchtitan/pull/344","shortMessageHtmlLink":"Add 3D support"}}],"hasNextPage":true,"hasPreviousPage":false,"activityType":"all","actor":null,"timePeriod":"all","sort":"DESC","perPage":30,"cursor":"djE6ks8AAAAEYCThUwA","startCursor":null,"endCursor":null}},"title":"Activity · pytorch/torchtitan"}